PREFACE


Is it fun to solve problems, and is solving problems about something a good 
way to learn something? The answers seem to be yes, provided the problems 
are neither too hard nor too easy. 

The book is addressed to students (and teachers) of undergraduate linear 
algebra; it might supplement but not (I hope) replace my old Finite-Dimensional
Vector Spaces. It largely follows that old book in organization and level
and order—but only “largely”—the principle is often violated. This is not a 
step-by-step textbook—the problems vary back and forth between subjects, 
they vary back and forth from easy to hard and back again. The location of a 
problem is not always a hint to what methods might be appropriate to solve 
it or how hard it is. 

Words like “hard” and “easy” are subjective of course. I tried to make 
some of the problems accessible to any interested grade school student, and 
at the same time to insert some that might stump even a professional expert 
(at least for a minute or two). Correspondingly, the statements of the problems,
and the introductions that precede and the solutions that follow them
sometimes laboriously explain elementary concepts, and, at other times assume
that you are at home with the language and attitude of mathematics
at the research level. Example: sometimes I assume that you know nothing, 
and carefully explain the associative law, but at other times I assume that the 
word “topology”, while it may not refer to something that you are an expert 
in, refers to something that you have heard about. 

The solutions are intrinsic parts of the exposition. You are urged to look 
at the solution of each problem even if you can solve the problem without 
doing so—the solution sometimes contains comments that couldn’t be made 
in the statement of the problem, or even in the hint, without giving too much 
of the show away. 

I hope you will enjoy trying to solve the problems, I hope you will learn
something by doing so, and I hope you will have fun. 


HINTS


Chapter 1. Scalars 


Hint 1. Write down the associative law for [+]. 
Hint 2. Same as for Problem 1: substitute and look. 
Hint 3. How could it be? 

Hint 4. Note the title of the problem. 


Hint 5. The affine transformation of the line associated with the real numbers
$\alpha$ and $\beta$ is the one that maps each real number $\xi$ onto $\alpha\xi + \beta$.


Hint 6. Does it help to think about 2 x 2 matrices? If not, just compute. 


Hint 7. Let $r_6$ (for “reduce modulo 6”) be the function that assigns to each
non-negative integer the number that’s left after all multiples of 6 are thrown
out of it. Examples: $r_6(8) = 2$, $r_6(183) = 3$, and $r_6(6) = 0$. Verify that
the result of multiplying two numbers and then reducing modulo 6 yields the
same answer as reducing them first and then multiplying the results modulo
6. Example: the ordinary product of 10 and 11 is 110, which reduces modulo
6 to 2; the reduced versions of 10 and 11 are 4 and 5, whose product modulo
6 is $20 - 18 = 2$.
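
The check that the hint asks for can be previewed numerically before it is proved. What follows is a minimal sketch in Python, not part of the original text: the names `r6` and `products_agree` are made up for the illustration, and a finite search of this kind is evidence for the claim, not a proof of it.

```python
def r6(n):
    """Reduce a non-negative integer modulo 6 (throw out all multiples of 6)."""
    return n % 6

def products_agree(limit):
    """True if multiply-then-reduce equals reduce-then-multiply for all 0 <= m, n < limit."""
    return all(
        r6(m * n) == r6(r6(m) * r6(n))
        for m in range(limit)
        for n in range(limit)
    )

# The example from the hint: 10 * 11 = 110 reduces to 2, and so does 4 * 5 = 20.
print(r6(10 * 11), r6(r6(10) * r6(11)))  # prints "2 2"
print(products_agree(50))                # prints "True"
```

The bound 50 is arbitrary; any larger bound only strengthens the evidence.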


Hint 8. The answer may or may not be easy to guess, but once it’s correctly
guessed it’s easy to prove. The answer is yes. 


Hint 9. Not as a consequence but as a coincidence the answer is that the 
associative ones do and the others don’t. 


Hint 10. To find $\frac{1}{\alpha + i\beta}$ multiply both numerator and denominator by
$\alpha - i\beta$. The old-fashioned name for the procedure is “rationalize the
denominator”.


Hint 11. The unit is (1,0). Caution: non-commutativity. 


Hint 12. The unit is $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.


Hint 13. (a) and (b). Are the operations commutative? Are they associative? 
Do the answers change if $\mathbb{R}_+$ is replaced by (0, 1]?
(c) Add (—2) to both sides of the assumed equation. 


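Hint 14. An affine transformation $\xi \mapsto \alpha\xi + \beta$ with $\alpha = 0$ has no inverse;
a matrix $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ with $\alpha\delta - \beta\gamma = 0$ has no inverse.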


The integers modulo 3 form an additive group, and so do the integers 
modulo anything else. Multiplication is subtler. Note: the number 6 is not a 
prime, but 7 is. 


Hint 15. If the underlying set has only two elements, then the answer is no. 
Hint 16. Use both distributive laws. 


Hint 17. In the proofs of the equations the distributive law must enter directly
or indirectly; if not there’s something wrong. The non-equations are
different: one of them is true because that’s how language is used, and the 
other is not always true. 


Hint 18. Think about the integers modulo 5. 


Hint 19. The answer is yes, but the proof is not obvious. One way to do it
is by brute force; experiment with various possible ways of defining + and
×, and don’t stop till the result is a field.

A more intelligent and more illuminating way is to think about polynomials
instead of integers. That is: study the set P of all polynomials with
coefficients in a field of two elements, and “reduce” that set “modulo” some
particular polynomial, the same way as the set $\mathbb{Z}$ of integers is reduced
modulo a prime number p to yield the field $\mathbb{Z}_p$. If the coefficient field is taken
to be $\mathbb{Q}$ and the modulus is taken to be $x^2 - 2$, the result of the process is
(except for notation) the field $\mathbb{Q}(\sqrt{2})$. If the coefficient field is taken to be $\mathbb{Z}_2$
and the modulus is taken to be an appropriately chosen polynomial of degree
2, the result is a field with four elements. Similar techniques work for 8, 16,
32, ... and 9, 27, 81, ..., etc.
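
The polynomial construction can be tried out on a machine. The following Python sketch is an illustration only; it assumes that the modulus is $x^2 + x + 1$ (the hint does not name the polynomial, so that choice is an assumption of the example), and the names `ELEMENTS`, `add`, `mul`, and the two checking functions are hypothetical.

```python
from itertools import product

# A pair (a, b) with a, b in {0, 1} stands for the polynomial a + b*x over Z_2.
ELEMENTS = list(product((0, 1), repeat=2))
ZERO, ONE = (0, 0), (1, 0)

def add(p, q):
    """Add coefficientwise modulo 2."""
    return ((p[0] + q[0]) % 2, (p[1] + q[1]) % 2)

def mul(p, q):
    """Multiply, then reduce modulo x^2 + x + 1 (assumed modulus), using x^2 = x + 1."""
    a, b = p
    c, d = q
    # (a + b x)(c + d x) = ac + (ad + bc) x + bd x^2; replace x^2 by x + 1.
    return ((a * c + b * d) % 2, (a * d + b * c + b * d) % 2)

def nonzero_elements_are_invertible():
    """Every element other than 0 has a multiplicative inverse."""
    return all(
        any(mul(p, q) == ONE for q in ELEMENTS)
        for p in ELEMENTS
        if p != ZERO
    )

def laws_hold():
    """Commutativity and associativity of mul, and distributivity, on all triples."""
    return all(
        mul(p, q) == mul(q, p)
        and mul(mul(p, q), r) == mul(p, mul(q, r))
        and mul(p, add(q, r)) == add(mul(p, q), mul(p, r))
        for p, q, r in product(ELEMENTS, repeat=3)
    )
```

With this choice of modulus the elements x and x + 1 turn out to be inverses of each other, which is the only non-obvious entry in the multiplication table.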


Chapter 2. Vectors 


Hint 20. The 0 element of any additive group is characterized by the fact
that 0 + 0 = 0. How can it happen that $\alpha x = 0$? Related question worth
asking: how can it happen that $\alpha x = x$?


Hint 21. (1) The scalar distributive law; (2) the scalar identity law; (3) the 
vector distributive law; (4) none; (5) the associative law; (6) none. 


Hint 22. Can you solve two equations in two unknowns? 
Hint 23. (a): (1), (2), and (4); (b): (2) and (4). 


Hint 24. (a) Always. (b) In trivial cases only. Draw pictures. Don’t forget 
finite fields. If it were known that a vector space over an infinite field cannot 
be the union of any two of its proper subspaces, would it follow that it cannot 
be the union of any finite number? In any event: whenever $M_1$ and $M_2$ are
subspaces, try to find a vector x in $M_1$ but not in $M_2$ and a vector y not in
$M_1$, and consider the line through y parallel to x.


Hint 25. (a) Can it be done so that no vector in either set is a scalar 
multiple of a vector in the other set? (b) Can you solve three equations in 
three unknowns? 


Hint 26. Is it true that if x is a linear combination of y and something in 
M, then y is a linear combination of x and something in M? 


Hint 27. (a) No; that’s easy. (b) Yes; that’s very easy. (c) No; and that takes
some doing, or else previous acquaintance with the subject. (d) Yes; and all 
it requires is the definition, and minimum acquaintance with the concept of 
polynomial. 

The reader should be aware that the problem was phrased in incorrect 
but commonly accepted mathematese. Since “span” is a way to associate 
a subspace of V with each subset of V, the correct phrasing of (a) is: “is 
there a singleton that spans $\mathbb{R}^2$?” Vectors alone, or even together with others
(as in (b) and (c)), don’t span subspaces; spanning is done by sets of vectors.
The colloquialism does no harm so long as its precise meaning is not
forgotten. 


Hint 28. Note that since $L \cap (L \cap N) = L \cap N$, the equation is a special
case of the distributive law. The answer to the question is yes. The harder 
half to prove is that the left side is included in the right. Essential step: 
subtract. 

Hint 29. Look at pictures in $\mathbb{R}^2$.


Hint 30. No. 


Hint 31. Just look for the correct term to transpose from one side of the 
given equation to the other. 


Hint 32. Use Problem 31. 


Chapter 3. Bases 


Hint 33. Examine the case in which E consists of a single vector. 


Hint 34. It is an elementary fact that if M is an m-dimensional subspace of
an n-dimensional vector space V, then every complement of M has dimension
$n - m$. It follows that if several subspaces of V have a simultaneous
complement, then they all have the same dimension. Problem 24 is relevant.


Hint 35. (a) Irrational? (b) Zero? 


Hint 36. $\sqrt{2}$?


Hint 37. (a) $\alpha - \beta$? (b) No room. Keep in mind that the natural coefficient
field for C? is C. 


Hint 38. Why not? One way to answer (a) is to consider two independent 
vectors each of which is independent of (1,1). One way to answer (b) is to 
adjoin (0,0, 1,0) and (0,0,1,1) to the first two vectors and (1,0,0,0) and 
(1,1, 0,0) to the second two. 


Hint 39. (a) Too much room. (b) Are there any? 


Hint 40. How many vectors can there be in a maximal linearly independent 
set? 


Hint 41. What information about $V^{\text{real}}$ does a basis of V give?
Hint 42. Can a basis for a proper subspace span the whole space? 


Hint 43. Use Problems 32 and 33. Don’t forget to worry about independence 
as well as totality. 


Hint 44. Given a subspace, look for an independent set in it that is as large 
as possible. 


Hint 45. Note that finite-dimensionality was not explicitly assumed. Recall 
that a possibly infinite set is called dependent exactly when it has a finite 
subset that is dependent. Contrariwise, a set is independent if every finite 
subset of it is independent. As for the answer, all it needs is the definitions 
of the two concepts that enter. 


Hint 46. It is tempting to apply a downward induction argument, possibly 
infinite. People who know about Zorn’s lemma might be tempted to use it, 
but the temptation is not likely to lead to a good result. A better way to settle 
the question is to use Problem 45. 


Hint 47. Omit one vector, express it as a linear combination of remaining 
vectors, and then omit a new vector different from all the ones used so far. 


Hint 48. A few seconds of geometric contemplation will reveal a relatively
independent subset of $\mathbb{R}^3$ consisting of 5 vectors (which is n + 2 in this case).
If, however, F is the field $\mathbb{Z}_2$ of integers modulo 2, then a few seconds of
computation will show that no relatively independent subset of $F^3$ can contain
more than 4 vectors. Why is that? What is the big difference between F and
$\mathbb{R}$ that is at work here?
Having thought about that question, proceed to use induction.


Hint 49. A slightly modified question seems to be easier to approach: how 
many ordered bases are there? For the answer, consider one after another 
questions such as these: how many ways are there of picking the first vector 
of a basis?; once the first vector has been picked, how many ways are there 
of picking the second vector?; etc. 


Hint 50. The answer is n + m. 


Hint 51. There is a sense in which the required constructions are trivial: no 
matter what V is, let M be O and let N be V. In that case V/M is the same 
as V and V/N is a vector space with only one element, so that, except for 
notation, it is the same as the vector space O. If V was infinite-dimensional to 
begin with, then this construction provides trivial affirmative answers to both 
parts of the problem. Many non-trivial examples exist; to find one, consider 
the vector space P of polynomials (over, say, the field R of real numbers). 


Hint 52. The answer is $n - m$. Start with a basis of M, extend it to a basis
of V, and use the result to construct a basis of V/M. 


Hint 53. If M and N are finite subsets of a set, what relation, if any,
is always true among the numbers card M, card N, card($M \cup N$), and
card($M \cap N$)?


Chapter 4. Transformations 


Hint 54. Squaring scalars is harmless; trying to square vectors or their parts 
is what interferes with linearity. 


Hint 55. (1) Every linear functional except one has the same range. 

(2) Compare this change of variables with the one in Problem 54 
(1 (b)). 

(3) How many vectors does the range contain? 

(4) Compare this transformation with the squaring in Problem 54 
(2 (b)). 


(5) How does this weird vector space differ from $\mathbb{R}^1$?


Hint 56. (1) What do you know about a function if you know that its
indefinite integral is identically 0? 

(2) What do you know about a function if you know that its derivative 
is identically 0? 

(3) Solve two “homogeneous” equations in two unknowns. (“Homogeneous”
means that the right sides are 0.)

(4) When is a polynomial 0? 

(5) What happens to the coordinate axes? 

(6) This is an old friend. 


Hint 57. (1) What could possibly go wrong? 

(2) Neither transformation goes from $\mathbb{R}^2$ to $\mathbb{R}^3$.

(3) What happens when both S and T are applied to the constant polynomial
1? What about the polynomial x?

(4) Do both products make sense? 

(5) What happens when both S and T are applied to the constant polynomial
1? What about the polynomial x? What about $x^2$?

(6) There is nothing to do but honest labor. 


Hint 58. Consider complements: for left divisibility, consider a complement 
of ker B, and for right divisibility consider a complement of ran B. 


Hint 59. (1) If the result of applying a linear transformation to each vector 
in a total set is known, then the entire linear transformation is known. 
(2) How many powers does A have? 
(3) What is $A^2 x$?
Hint 60. Make heavy use of the linearity of T.
Hint 61. (1) What is the kernel? (2) What is $T^2$? (3) What is the range?


Hint 62. Choose the entries ($\gamma$) and ($\delta$) closely related to the entries ($\alpha$)
and ($\beta$).


Hint 63. Direct sum, equal rows, and similarity. 


Hint 64. The “conjecturable” answer is too modest; many of the 0’s below 
the diagonal can be replaced by 1’s without losing invertibility. 


Hint 65. Start with a basis of non-invertible elements and make them
invertible.


Hint 66. To say that a linear transformation sends some independent set onto
a dependent set is in effect the same as saying that it sends some non-zero 
vector onto 0. 


Hint 67. If the dimension is 2, then there is only one non-trivial permutation 
to consider. 


Hint 68. There is nothing to do but use the general formula for matrix 
multiplication. It might help to try the 2 x 2 case first. 


Hint 69. Look at diagonal matrices. 

Hint 70. Yes. 

Hint 71. Consider differentiation. 

Hint 72. If E is a projection, what is $E^2$?
Hint 73. Multiply them. 


Hint 74. No and no. Don’t forget to ask and answer some other natural 
questions in this neighborhood. 


Chapter 5. Duality 


Hint 75. If there were such a scalar, would it be uniquely determined by 
the prescribed linear functionals $\xi$ and $\eta$?


Hint 76. Use a basis of V to construct a basis of V’. 


Hint 77. This is very easy; just ask what information the hypothesis gives 
about the kernel of T. 


Hint 78. Does it help to assume that V is finite-dimensional? 


Hint 79. If V is $\mathbb{R}^5$ and M is the set of all vectors of the form $(\xi_1, \xi_2, \xi_3, 0, 0)$,
what is the annihilator of M? 


Hint 80. What are their dimensions? 


Hint 81. Surely there is only one sanely guessable answer.
Hint 82. Can kernels and ranges be used? 


Hint 83. There is no help for it: compute with subscripts. 


Chapter 6. Similarity 

Hint 84. How does one go from x to y?

Hint 85. How does one go from $\eta$ to $\xi$?

Hint 86. Transform from the x’s to the y’s, as before. 


Hint 87. Use, once again, the transformation that takes one basis to the 
other, but this time in matrix form. 


Hint 88. If one of B and C is invertible, the answer is yes. 

Hint 89. Think about real and imaginary parts. That can solve the problem, 
but if elementary divisor theory is an accessible tool, think about it: the insight 
will be both less computational and more deep. 

Hint 90. Extend a basis of ker A to a basis of the whole space. 

Hint 91. Yes. 


Hint 92. Look first at the case in which $\beta\gamma \neq 0$.


Hint 93. The first question should be about the relation between ranges and 
sums. 


Hint 94. The easy relation is between the rank of a product and the rank 
of its first factor; how can information about that be used to get information 
about the second factor? 


Hint 95. The best relation involves null A + null B. 


Hint 96. For numerical calculations the geometric definition of similarity is 
easier to use than the algebraic one. 


Hint 97. There are two pairs of bases, and, consequently, it is reasonable
to expect that two transformations will appear, one for each pair. 


Hint 98. Even though the focus is on the dimensions of ranges, it might be 
wise to begin by looking at the dimensions of kernels. 


Chapter 7. Canonical Forms


Hint 99. If A is a linear transformation, is there a connection between the 
eigenvalues of A and $A^2$?


Hint 100. The answer is no. Can you tell by looking at a polynomial equation
what the sum and the product of the roots has to be?


Hint 101. This is not easy. Reduce the problem to the consideration of
$\lambda = 1$, and then ask whether the classical infinite series formula for $\frac{1}{1 - x}$
suggests anything.


Hint 102. What about monomials? 

Hint 103. $\lambda^0 = 1$.

Hint 104. If $\mu$ is an eigenvalue of A, consider the polynomial $p(\lambda) - \mu$.
Hint 105. What are the eigenvalues? 

Hint 106. What does the assumption imply about eigenvectors? 

Hint 107. No. 


Hint 108. Look for the triangular forms that are nearest to diagonal ones 
—that is the ones for which as many as possible of the entries above the 
diagonal are equal to 0. 


Hint 109. Think about complex numbers. 


Hint 110. What can the blocks in a triangularization of A look like? 


Hint 111. The answer depends on the dimension of the space and on the
index of nilpotence; which plays the bigger role? 


Hint 112. The answer depends on size; look at matrices of size 2 and 
matrices of size 3. 


Hint 113. Examine what M does to a general vector $(\alpha, \beta, \gamma, \delta, \varepsilon)$ and then
force the issue. 


Hint 114. Don’t get discouraged by minor setbacks. A possible approach is 
to focus on the case $\lambda = 1$, and use the power series expansion of $\sqrt{1 + \lambda}$.


Hint 115. Use eigenvalues—they are more interesting. Matrices, however, 
are quicker here. 


Hint 116. What turns out to be relevant is the Chinese Remainder Theorem.
The version of that theorem in elementary number theory says that
if $x_1, \ldots, x_n$ are integers, pairwise relatively prime, and if $y_1, \ldots, y_n$ are
arbitrary integers, then there exists an integer z such that


$z \equiv y_j \pmod{x_j}$


for $j = 1, \ldots, n$. A more sophisticated algebraic version of the theorem
has to do with sets of pairwise relatively prime ideals in arbitrary rings, 
which might not be commutative. The issue at hand is a special case of that 
algebraic theorem, but it can be proved directly. The ideals that enter are 
the annihilators (in the ring of all complex polynomials) of the given linear 
transformations. 
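
For the integer version of the theorem, a small experiment may make the statement concrete. The following Python sketch is an illustration under stated assumptions, not the argument the hint has in mind: the function name `crt_solution` is hypothetical, the moduli are assumed to be positive, and the search is plain brute force over one full period.

```python
from math import gcd, prod

def crt_solution(moduli, residues):
    """Return the least z >= 0 with z congruent to y_j modulo x_j for every j."""
    pairs = [(a, b) for i, a in enumerate(moduli) for b in moduli[i + 1:]]
    if any(gcd(a, b) != 1 for a, b in pairs):
        raise ValueError("moduli must be pairwise relatively prime")
    # Brute force: try every residue class modulo the product of the moduli.
    for z in range(prod(moduli)):
        if all((z - y) % x == 0 for x, y in zip(moduli, residues)):
            return z
    raise AssertionError("no solution found; the theorem says this cannot happen")

print(crt_solution([3, 5, 7], [2, 3, 2]))  # prints "23"
```

The search space has size equal to the product of the moduli, so the sketch is suitable only for small examples.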


Chapter 8. Inner Product Spaces 


Hint 117. Form the inner product of a linear dependence relation with any 
one of its terms. 


Hint 118. Is there an expression for (x, y) in terms of x and y and norms,
one that involves no inner products such as (u, v) with $u \neq v$?


Hint 119. Examine both real and complex vector spaces. 


Hint 120. Always. 


Hint 121. Keep enlarging.
Hint 122. Evaluate the norms of linear combinations of x and y. 


Hint 123. How close does an arbitrary vector in V come to a linear combination
of an orthonormal basis for M?


Hint 124. Look at $\ker^\perp \xi$.
Hint 125. (a) By definition (U*v, w) = (v, Uw); there is no help for it but
to compute with that. (b) Yes. (c) Look at the image under U of the graph of
A.


Hint 126. Some of the answers are yes and some are no, but there is only 
one (namely (d)) that might cause some head scratching. 


Hint 127. Is something like polarization relevant? 
Hint 128. Always? 


Hint 129. Only (c) requires more than a brief moment’s thought; there are 
several cases to look at. 


Hint 130. Problem 127 is relevant. 

Hint 131. The easy ones are (a) and (b); the slightly less easy but straightforward
ones are (c) and (d). The only one that requires a little thought is (e);
don’t forget that $\alpha$ must be real for the question to make sense.

Hint 132. The answer is short, but a trick is needed. 

Hint 133. What is the adjoint of a perpendicular projection? 


Hint 134. A little computation never hurts. 


Hint 135. If $E \leq F$ and x is in ran E, evaluate $\|x - Fx\|$. If ran E $\subset$ ran F,
then EF = E.


Hint 136. Is the product of two perpendicular projections always a perpendicular
projection?


Hint 137. Quadratic forms are relevant.


Hint 138. If $Ax_1 = \lambda_1 x_1$ and $Ax_2 = \lambda_2 x_2$, examine $(x_1, x_2)$.


Chapter 9. Normality 


Hint 139. Is the dimension of the underlying vector space finite or infinite? 
Is U necessarily either injective or surjective? 


Hint 140. Can “unitary” be said in matrix language? 


Hint 141. The question is whether any of the three conditions implies any 
of the others, and whether any two imply the third. 


Hint 142. Must they be diagonal? 
Hint 143. Look at the eigenspaces corresponding to the distinct eigenvalues. 
Hint 144. Diagonalize. 


Hint 145. Assume the answer and think backward. The invertible case is 
easier. 


Hint 146. Imitate the Hermitian proof. 

Hint 147. Imitate the Hermitian proof. 

Hint 148. It’s a good idea to use the spectral theorem. 

Hint 149. Use Solution 148. 

Hint 150. Assuming that AS = SB, with both A and B normal, use the 
linear transformations A, B, and S, as entries in 2 x 2 matrices, so as to be 
able to apply the adjoint commutativity theorem. 


Hint 151. Put C = B(A* A) — (A*A)B and study the trace of C*C. 


Hint 152. (a) Consider triangular matrices. (b) Consider the Hamilton-Cayley
equation.


Hint 153. Use square roots.
Hint 154. The two answers are different. 


Hint 155. Some do and some don’t; the emphasis is on necessarily. For 
some of the ones that don’t, counterexamples of size 2 are not large enough. 


Hint 156. How many eigenvectors are there? More generally, how many 
invariant subspaces of dimension k are there, 0 < k < n? 


Hint 157. What’s the relation between A and the matrix
$\begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$?


Hint 158. Consider the polar decomposition of a transformation that effects
the similarity; a natural candidate for a unitary transformation that effects the
equivalence is its unitary factor. Don’t be surprised if the argument wants to
lean on the facts about adjoint intertwining (Solution 150). 


Hint 159. Most people find it difficult to make the right guess about this 
question when they first encounter it. The answer turns out to be no, but even 
knowing that does not make it easy to find a counterexample, and, having 
found one, to prove that it works. One counterexample is a 3 x 3 nilpotent 
matrix, and one way to prove that it works is to compute. 


Hint 160. Solution 89 describes a way of passing from complex similarity 
to real similarity, and Solution 158 shows how to go from (real or complex) 
similarity to (real or complex) unitary equivalence. The trouble is that Solution 
158 needs the adjoint intertwining theorem (Solution 150), which must assume 
that the given transformations are normal. Is the assumed unitary equivalence 
sufficiently stronger than similarity to imply at least a special case of the 
intertwining theorem that can be used here? 


Hint 161. Look at the Jordan form of A. 
Hint 162. Is a modification of the argument of Solution 161 usable? 


Hint 163. If A is nilpotent of index 2, examine subspaces of the form 
N + AN, where N is a subspace of ker A*. 


Hint 164. The most obvious nilpotent transformation on $\mathbb{C}^4$ is the truncated
shift (see Problem 156), but that has index 4. It’s tempting to look at its 
square, but that has index 2. What along these lines can be done to produce 
nilpotence of index 3? 


SOLUTIONS 


Chapter 1. Scalars 


Solution 1. 


The associative law for $\boxplus$ expressed in terms of + looks like this:


$2(2\alpha + 2\beta) + 2\gamma = 2\alpha + 2(2\beta + 2\gamma)$,

which comes to

$4\alpha + 4\beta + 2\gamma = 2\alpha + 4\beta + 4\gamma$. (*)

That can be true, but it doesn’t have to be; it is true if and only if $\alpha = \gamma$. If,
for instance, $\alpha = \beta = 0$ and $\gamma = 1$, then the desired equation becomes the
falsehood

$0 + 0 + 2 = 0 + 0 + 4$. (**)

Conclusion: the associative law for $\boxplus$ is false.


Comment. Does everyone agree that an alphabetical counterexample (such
as (*)) is neither psychologically nor logically as convincing as a numerical
one (such as (**))?
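
A numerical counterexample is also easy to hand to a machine. The following Python sketch is an illustration only; it assumes, as the displayed equations indicate, that double addition sends the pair $(\alpha, \beta)$ to $2\alpha + 2\beta$, and the names `double_add` and `associates` are hypothetical.

```python
from itertools import product

def double_add(a, b):
    """Double addition: the pair (a, b) goes to 2a + 2b."""
    return 2 * a + 2 * b

def associates(a, b, c):
    """True if (a [+] b) [+] c equals a [+] (b [+] c)."""
    return double_add(double_add(a, b), c) == double_add(a, double_add(b, c))

# The numerical counterexample: alpha = beta = 0 and gamma = 1.
left = double_add(double_add(0, 0), 1)
right = double_add(0, double_add(0, 1))
print(left, right)  # prints "2 4"

# On a small grid of integers the law holds exactly when alpha equals gamma.
grid_agrees = all(
    associates(a, b, c) == (a == c)
    for a, b, c in product(range(-3, 4), repeat=3)
)
print(grid_agrees)  # prints "True"
```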


Solution 2.


The associative law for $\boxplus$ is false. The equation


$\alpha \boxplus (\beta \boxplus \gamma) = (\alpha \boxplus \beta) \boxplus \gamma$

says that

$2\alpha + (2\beta + \gamma) = 2(2\alpha + \beta) + \gamma$,


which is true if and only if $\alpha = 0$. If, for instance, $\alpha = 1$ and $\beta = \gamma = 0$,
then the desired equation becomes the falsehood


$2 + (0 + 0) = 2(2 + 0) + 0$.


Solution 3. 


For both commutativity and associativity it is harder to find instances where
they hold than instances where they don’t. Thus, for instance,


$(\alpha^\beta)^\gamma = \alpha^{(\beta^\gamma)}$


is true only if $\alpha = 1$, or $\gamma = 1$, or $\beta = \gamma = 2$. If, in particular, $\alpha = \gamma = 2$
and $\beta = 1$, then it is false. Exponentiation is neither commutative nor
associative.
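
The three exceptional cases can be watched at work on small positive integers. The following Python sketch is an experiment, not a proof; the function names are hypothetical, and the search is restricted (an assumption of the example) to integers from 1 to 5.

```python
from itertools import product

def power_associates(a, b, c):
    """True if (a^b)^c equals a^(b^c)."""
    return (a ** b) ** c == a ** (b ** c)

def predicted(a, b, c):
    """The cases named in the solution: alpha = 1, or gamma = 1, or beta = gamma = 2."""
    return a == 1 or c == 1 or (b == 2 and c == 2)

# The counterexample from the text: alpha = gamma = 2 and beta = 1.
print((2 ** 1) ** 2, 2 ** (1 ** 2))  # prints "4 2"

# Commutativity fails too: 2^3 and 3^2 are different.
print(2 ** 3, 3 ** 2)  # prints "8 9"

# For integers from 1 to 5 the law holds exactly in the predicted cases.
small_cases_agree = all(
    power_associates(a, b, c) == predicted(a, b, c)
    for a, b, c in product(range(1, 6), repeat=3)
)
```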


Solution 4. 


Both answers are yes, and one way to prove them is to compute. Since


$(\gamma, \delta) \boxdot (\alpha, \beta) = (\gamma\alpha - \delta\beta, \gamma\beta + \delta\alpha)$,


the commutativity of $\boxdot$ is a consequence of the commutativity of the ordinary
multiplication of real numbers.
The computation for associativity needs more symbols:


$((\alpha, \beta) \boxdot (\gamma, \delta)) \boxdot (\varepsilon, \varphi)$
$= ((\alpha\gamma - \beta\delta)\varepsilon - (\alpha\delta + \beta\gamma)\varphi, (\alpha\gamma - \beta\delta)\varphi + (\alpha\delta + \beta\gamma)\varepsilon)$

and

$(\alpha, \beta) \boxdot ((\gamma, \delta) \boxdot (\varepsilon, \varphi))$
$= (\alpha(\gamma\varepsilon - \delta\varphi) - \beta(\gamma\varphi + \delta\varepsilon), \alpha(\gamma\varphi + \delta\varepsilon) + \beta(\gamma\varepsilon - \delta\varphi))$.

By virtue of the associativity of the ordinary multiplication of real numbers
the same eight triple products, with the same signs, occur in the right-hand
sides of both these equations.

For people who know about complex numbers and know that for them
addition and multiplication are both commutative and associative, Problem 4
takes just as little work as the paragraph that introduced it. Indeed:
if $(\alpha, \beta)$ is thought of as $\alpha + \beta i$ then $\boxplus$ and $\boxdot$ become “ordinary”
complex addition and multiplication, and after that insight nothing remains to
be done.
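
The identification with complex numbers can be checked mechanically. The following Python sketch is an illustration only: the product formula is the one displayed in the solution, the coordinatewise sum is an assumption about the statement of Problem 4 (which is not reproduced here), and all function names are hypothetical.

```python
from itertools import product

def pair_add(p, q):
    """Coordinatewise sum of two pairs (assumed definition of the boxed plus)."""
    return (p[0] + q[0], p[1] + q[1])

def pair_mul(p, q):
    """(a, b) times (c, d) is (ac - bd, ad + bc)."""
    a, b = p
    c, d = q
    return (a * c - b * d, a * d + b * c)

def as_complex(p):
    """Think of the pair (a, b) as the complex number a + bi."""
    return complex(p[0], p[1])

def matches_complex_arithmetic(values=range(-2, 3)):
    """Compare both operations with Python's built-in complex arithmetic."""
    pairs = list(product(values, repeat=2))
    return all(
        as_complex(pair_mul(p, q)) == as_complex(p) * as_complex(q)
        and as_complex(pair_add(p, q)) == as_complex(p) + as_complex(q)
        for p, q in product(pairs, repeat=2)
    )

print(pair_mul((1, 2), (3, 4)))  # prints "(-5, 10)"
```

Small integer coordinates are used so that the floating-point arithmetic behind `complex` is exact.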


Solution 5. 


Straightforward computation shows that the equation


$(\alpha, \beta) \boxdot (\gamma, \delta) = (\gamma, \delta) \boxdot (\alpha, \beta)$


is a severe condition that is quite unlikely to be satisfied. An explicit counterexample
is given by


$(1, 1) \boxdot (2, 1) = (2, 2)$


and


$(2, 1) \boxdot (1, 1) = (2, 3)$.


The associativity story is quite different; there straightforward computation
shows that it is always true. This way of multiplying pairs of real
numbers is not a weird invention; it arises in a natural classical context. An
affine transformation of the real line is a mapping S defined for each real
number $\xi$ by an equation of the form $S(\xi) = \alpha\xi + \beta$, where $\alpha$ and $\beta$ themselves
are fixed preassigned real numbers. If T is another such mapping,
$T(\xi) = \gamma\xi + \delta$, then the composition ST (for the purist: $S \circ T$) is given by


$(ST)(\xi) = S(\gamma\xi + \delta) = \alpha(\gamma\xi + \delta) + \beta = (\alpha\gamma)\xi + (\alpha\delta + \beta)$.


In other words, the product ST of the transformations corresponding to


$(\alpha, \beta)$ and $(\gamma, \delta)$


is exactly the transformation corresponding to $(\alpha, \beta) \boxdot (\gamma, \delta)$. Since the operation
of composing transformations is always associative, the associativity
of $\boxdot$ can be inferred with no further computation.

Is that all right? Is the associativity of functional composition accepted? 
If it is not accepted, it can be proved as follows. Suppose that R, S, and T 
are mappings of a set into itself, and write P = RS, Q = ST. Then, for each
x in the domain,


((RS)T)(x) = (PT)(x) = P(T(x)) [by the definition of PT]
= (RS)(T(x)) = R(S(T(x))) [by the definition of RS],

whereas

(R(ST))(x) = (RQ)(x) = R(Q(x)) [by the definition of RQ]
= R((ST)(x)) = R(S(T(x))) [by the definition of ST].


Since the last terms of these two chains of equations are equal, the first ones 
must be also. 
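
The correspondence between pairs and affine transformations lends itself to a quick experiment. The following Python sketch is an illustration only; the product rule is read off from the formula for ST in the solution, and the names `pair_op` and `affine` are hypothetical.

```python
from itertools import product

def pair_op(p, q):
    """(a, b) times (c, d) is (ac, ad + b), the pair that belongs to the composition."""
    a, b = p
    c, d = q
    return (a * c, a * d + b)

def affine(p):
    """The affine transformation x -> a*x + b associated with the pair (a, b)."""
    a, b = p
    return lambda x: a * x + b

# The counterexample to commutativity from the text.
print(pair_op((1, 1), (2, 1)), pair_op((2, 1), (1, 1)))  # prints "(2, 2) (2, 3)"

# Composing the transformations agrees with multiplying the pairs.
pairs = list(product(range(-2, 3), repeat=2))
composition_agrees = all(
    affine(s)(affine(t)(x)) == affine(pair_op(s, t))(x)
    for s, t in product(pairs, repeat=2)
    for x in range(-3, 4)
)
```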


Solution 6. 


In view of the comment about Problem 5 being a special case, it follows
immediately that the present $\boxdot$ is not commutative. To get a counterexample,
take any two pairs that do not commute for the $\boxdot$ of Problem 5 and use
each of them as the beginning of a quadruple whose last two coordinates are
0 and 1. Concretely:


$(1, 1, 0, 1) \boxdot (2, 1, 0, 1) = (2, 2, 0, 1)$

and

$(2, 1, 0, 1) \boxdot (1, 1, 0, 1) = (2, 3, 0, 1)$.


Associativity is harder. It was true for Problem 5 and it might conceivably
have become false when the domain was enlarged for Problem 6. There
is no help for it but to compute; the result is that the associative law is
true here.

For those who know about the associativity of multiplication for 2 × 2
matrices no computation is necessary; just note that if a quadruple $(\alpha, \beta, \gamma, \delta)$
is written as

$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$,


then the present product coincides with the ordinary matrix product. 
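
For readers who prefer to let a machine do the honest labor, here is a minimal Python sketch. It assumes that the product of Problem 6 (whose statement is not reproduced here) is the row-by-row matrix product just described; the names `quad_mul` and `random_associativity_check` are hypothetical, and a random test is evidence, not a proof.

```python
import random

def quad_mul(p, q):
    """Multiply quadruples as 2 x 2 matrices written row by row (assumed formula)."""
    a, b, c, d = p
    e, f, g, h = q
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)

def random_associativity_check(trials=500, seed=0):
    """Test (pq)r == p(qr) on randomly chosen integer quadruples."""
    rng = random.Random(seed)

    def draw():
        return tuple(rng.randint(-5, 5) for _ in range(4))

    for _ in range(trials):
        p, q, r = draw(), draw(), draw()
        if quad_mul(quad_mul(p, q), r) != quad_mul(p, quad_mul(q, r)):
            return False
    return True

# The two products from the solution.
print(quad_mul((1, 1, 0, 1), (2, 1, 0, 1)))  # prints "(2, 2, 0, 1)"
print(quad_mul((2, 1, 0, 1), (1, 1, 0, 1)))  # prints "(2, 3, 0, 1)"
```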


Solution 7. 


The worst way to solve the problem is to say that there are only 36 (six times 
six) possible ordered pairs and only 216 (six times six times six) possible 
ordered triples that can be formed with 0, 1, 2, 3, 4, 5; in principle the
commutativity and associativity questions can be decided by examining all of
them.

A better way, for commutativity for instance, is to note that if each of $\alpha$
and $\beta$ is one of the numbers 0, 1, 2, 3, 4, 5, and if the largest multiple of 6 that
doesn’t exceed their ordinary product is, say, $6n$, so that $\alpha\beta = \gamma + 6n$, where
$\gamma$ is one of the numbers 0, 1, 2, 3, 4, 5, then, because ordinary multiplication
is commutative, the same conclusion holds for $\beta\alpha$. Consequence:


$\alpha \boxdot \beta = \gamma$


and


$\beta \boxdot \alpha = \gamma$.


The reasoning to prove associativity works similarly—the language and the 
notation have to be chosen with care but there are no traps and no difficulties. 

The intellectually most rewarding way is to use the hint. If m and n are 
non-negative integers, then each of them is one of the numbers 0, 1, 2, 3, 4, 
5 plus a multiple of 6 (possibly the zero multiple). Establish some notation: 
say $r(m) = \alpha$ and $r(n) = \beta$, so that m is $\alpha$ plus a multiple of 6 and n is $\beta$
plus a multiple of 6. (The reason for “r” is to be reminded of “reduce”.)
Consequence: when $mn$ and $\alpha\beta$ are reduced modulo 6 they yield the same
result. (Think about this step for a minute.) Conclusion:


$r(mn) = r(m) \boxdot r(n)$,


as the hint promised. 

This was work, but it uses a standard technique in algebra (it’s called 
homomorphism and it will be studied systematically later), and it pays off. 
Suppose, for instance, that each of $\alpha$, $\beta$, and $\gamma$ is one of 0, 1, 2, 3, 4, 5, so
that $r(\alpha) = \alpha$ and $r(\beta) = \beta$, $r(\gamma) = \gamma$. The proof of the associative law can
be arranged as follows:


$(\alpha \boxdot \beta) \boxdot \gamma = (r(\alpha) \boxdot r(\beta)) \boxdot r(\gamma)$
$= r(\alpha\beta) \boxdot r(\gamma)$ [by the preceding paragraph]
$= r((\alpha\beta)\gamma)$ [ditto]
$= r(\alpha(\beta\gamma))$ [because ordinary multiplication is associative]
$= r(\alpha) \boxdot r(\beta\gamma) = r(\alpha) \boxdot (r(\beta) \boxdot r(\gamma))$
$= \alpha \boxdot (\beta \boxdot \gamma)$,


where the last three equalities just unwind what the first three wound up.

An important difference between the modular arithmetic of 6 and 7 will
become visible later, but for most of the theory they act the same way, and that 
is true, in particular, as far as commutativity and associativity are concerned. 
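
The “worst way” mentioned at the beginning of the solution is tedious for a person but trivial for a machine. The following Python sketch examines all the ordered pairs and triples; the function names are hypothetical, and the same check is run for the modulus 7 as well, as an illustration of the remark that 6 and 7 behave alike in this respect.

```python
from itertools import product

def times_mod(n):
    """Return the operation: multiply, then reduce modulo n."""
    return lambda a, b: (a * b) % n

def is_commutative(op, elements):
    """Examine every ordered pair."""
    return all(op(a, b) == op(b, a) for a, b in product(elements, repeat=2))

def is_associative(op, elements):
    """Examine every ordered triple."""
    return all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(elements, repeat=3)
    )

for n in (6, 7):
    elements = range(n)
    op = times_mod(n)
    print(n, is_commutative(op, elements), is_associative(op, elements))
```

For n = 6 the two checks run through exactly the 36 pairs and 216 triples counted in the solution.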


Solution 8. 


The answer may or may not be easy to guess, but once it’s correctly guessed 
it’s easy to prove. The answer is yes, and anyone who believes that and sets 
out to construct an example is bound to succeed. 

Call the three elements for which multiplication is to be defined α, β,
and γ; the problem is to construct a multiplication table that is commutative
but not associative.

Question: what does commutativity say about the table? Answer: symmetry
about the principal diagonal (top left to bottom right). That is: if the
entry in row α and column β is, say, γ, then the entry in row β and column
α must also be γ.

How can associativity be avoided? How, for instance, can it be guaranteed
that


(α × β) × γ ≠ α × (β × γ)?


Possible approach: make α × β = γ and β × γ = α; then the associative
law will surely fail if γ × γ and α × α are different. That’s easy enough to
achieve and the following table is one way to do it:


×   α   β   γ

α   α   γ   β
β   γ   β   α
γ   β   α   γ


Here, for what it’s worth, is a verbal description of this multiplication: 
the product of two distinct factors is the third element of the set, and the 
product of any element with itself is that element again. 

This is not the only possible solution of the problem, but it’s one that
has an amusing relation to the double addition in Problem 1. Indeed, if the
notation is changed so as to replace α by 0, β by 2, and γ by 1, then the
present × satisfies the equation


α × β = 2α + 2β,


where the plus sign on the right-hand side denotes addition modulo 3. 
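
The verbal description of this multiplication is enough to test the claims by exhaustion. The sketch below is in Python; the element names and the helper times are illustrative, not part of the original solution.

```python
elements = ["alpha", "beta", "gamma"]


def times(a, b):
    """Product of two distinct elements is the third; a * a is a again."""
    if a == b:
        return a
    return next(c for c in elements if c != a and c != b)


commutative = all(times(a, b) == times(b, a)
                  for a in elements for b in elements)

# Triples (a, b, c) for which (a * b) * c differs from a * (b * c).
failures = [(a, b, c)
            for a in elements for b in elements for c in elements
            if times(times(a, b), c) != times(a, times(b, c))]

# Relabel alpha -> 0, beta -> 2, gamma -> 1 and compare with 2a + 2b modulo 3.
code = {"alpha": 0, "beta": 2, "gamma": 1}
matches_double_addition = all(
    code[times(a, b)] == (2 * code[a] + 2 * code[b]) % 3
    for a in elements for b in elements)

print(commutative, len(failures) > 0, matches_double_addition)
```

The list failures contains, in particular, the triple (alpha, beta, gamma) used in the construction.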


Solution 9.


(1) How could a real number ε be an identity element for double addition?
That is, can it be that


2α + 2ε = α


for all α? Clearly not: the equation holds only when α = −2ε, so that, in
particular, it does not hold when α = 1 and ε = 0.

(2) The answer is slightly different for half double addition. It is still true
that for no ε does


2α + ε = α


hold for all α, but since this operation is not commutative at least a glance
at the other order is called for. Could it be, for some ε, that


2ε + α = α


for all α? Sure: just put ε = 0. That is: half double addition has no right
identity element but it does have a left identity.

(3) Exponentiation behaves similarly but backward. There is a right identity,
namely 1 (α¹ = α for all α), but there is no left identity (ε^α = α for all
α is impossible no matter what ε is).

(4) The ordered pair (1, 0) (or, if preferred, the complex number 1 + 0·i) is
an identity for complex multiplication (both left and right, since multiplication
is commutative).

(5) The ordered pair (1, 0) does the job again, but this time, since multiplication
is not commutative, the pertinent equations have to be checked both
ways:

(α, β) × (1, 0) = (α, β)
and

(1, 0) × (α, β) = (α, β).
Equivalently: the identity mapping I, defined by I(ξ) = ξ, is an affine
transformation that is both a right and a left unit for functional composition.
That is: if S is an affine transformation, then


I ∘ S = S ∘ I = S.


(6) The quadruple (1, 0, 0, 1) is a unit for matrix multiplication (both left
and right), or, if preferred, the identity matrix


( 1  0 )
( 0  1 )


is an identity element.




Since complex multiplication and affine multiplication are known to
be special cases of matrix multiplication (see Problem 6), it should come as 
no surprise to learn that the identity elements described in (4) and (5) above 
are special cases of the one described in (6). 

(7) Modular addition and multiplication cause the least trouble: 0 does 
the job for +, and 1 does it for x. 


Solution 10. 


Given α and β, can one find γ and δ so that the product of (α, β) and (γ, δ) is
(1, 0)? The problem reduces to the solution of two equations in the unknowns
γ and δ:


αγ − βδ = 1,
αδ + βγ = 0.


The standard elementary techniques for doing that yield an answer in every
case, provided only that


α² + β² ≠ 0,


or in other words (since α and β are real numbers) provided only that not
both α and β are 0.

Alternatively: since in the customary complex notation


1/(α + βi) = (α − βi)/((α + βi)(α − βi)) = α/(α² + β²) − (β/(α² + β²))i,


it follows that (α, β) is invertible if and only if α² + β² ≠ 0, and, if that
condition is satisfied, then


(α, β)⁻¹ = (α/(α² + β²), −β/(α² + β²)).
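
The inverse formula for complex multiplication of pairs can be tried on a concrete pair. What follows is a sketch in Python with exact rational arithmetic; the helper names cmul and cinv are hypothetical, and the product rule is the one implied by the two equations at the start of this solution.

```python
from fractions import Fraction


def cmul(p, q):
    """Complex multiplication of ordered pairs: (a, b)(c, d) = (ac - bd, ad + bc)."""
    a, b = p
    c, d = q
    return (a * c - b * d, a * d + b * c)


def cinv(p):
    """Inverse of (a, b), defined exactly when a^2 + b^2 is not 0."""
    a, b = p
    norm = Fraction(a * a + b * b)
    if norm == 0:
        raise ZeroDivisionError("(0, 0) has no inverse")
    return (a / norm, -b / norm)


p = (3, 4)
print(cinv(p))           # the pair (3/25, -4/25) as Fractions
print(cmul(p, cinv(p)))  # the identity pair
```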


Solution 11. 


The equations to be solved are almost trivial in this case. The problem is,
given (α, β), to find (γ, δ) so that


αγ = 1 and αδ + β = 0.


The first equation has a solution if and only if α ≠ 0, and, if that is so, then
the second equation is solvable also. Conclusion: (α, β) is invertible if and
only if α ≠ 0, and, if so, then


(α, β)⁻¹ = (1/α, −β/α).


Caution: this multiplication is not commutative, and the preceding computation
guarantees a right inverse only. Does it work on the left too? Check
it:


(1/α, −β/α) × (α, β) = (1, 0).
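
Both the right and the left check can be done on sample values. This is a sketch in Python; compose and inverse are illustrative names, and the product rule (α, β) × (γ, δ) = (αγ, αδ + β) is read off from the two equations displayed at the start of this solution.

```python
from fractions import Fraction


def compose(s, t):
    """Affine multiplication of pairs: (a, b) x (c, d) = (a*c, a*d + b)."""
    a, b = s
    c, d = t
    return (a * c, a * d + b)


def inverse(s):
    """Inverse of (a, b) for a != 0, namely (1/a, -b/a)."""
    a, b = s
    a = Fraction(a)
    if a == 0:
        raise ZeroDivisionError("(0, b) is not invertible")
    return (1 / a, -b / a)


s = (3, 5)
print(compose(s, inverse(s)))  # right inverse check
print(compose(inverse(s), s))  # left inverse check
```

The same helper shows the failure of commutativity: compose((2, 1), (3, 0)) and compose((3, 0), (2, 1)) differ in their second coordinates.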


Solution 12. 


It is time to abandon the quadruple notation and the symbol ×; from now
on write


( α  β )
( γ  δ )


instead of (α, β, γ, δ) and indicate multiplication by juxtaposition (placing
the two symbols next to one another) instead of by ×. The problem is, given
a matrix


( α  β )
( γ  δ ),


to determine whether or not there exists a matrix


( α′  β′ )
( γ′  δ′ )


such that


( α  β ) ( α′  β′ )   ( 1  0 )
( γ  δ ) ( γ′  δ′ ) = ( 0  1 ).


What is asked for is a solution of four equations in four unknowns. The
standard solution techniques are easy enough to apply, but they are, of course,
rather boring. There is no help for it, for the present; an elegant general context
into which all this fits will become visible only after some of the theory of
linear algebra becomes known. The answer is that the matrix with rows (α, β)
and (γ, δ) is invertible if and only if αδ − βγ ≠ 0, and, if that is so, then


( α  β )⁻¹       1      (  δ  −β )
( γ  δ )    = ─────── ( −γ   α ).
               αδ − βγ

Readers reluctant to derive the result stand to gain something by at least
checking it, that is by carrying out the two multiplications


( α  β )      1      (  δ  −β )
( γ  δ ) · ─────── ( −γ   α )
            αδ − βγ

and

    1      (  δ  −β )   ( α  β )
 ─────── ( −γ   α ) · ( γ  δ )
 αδ − βγ


and noting that they yield the same answer, namely

( 1  0 )
( 0  1 ).


Comment. The present result applies, in particular, to the special matrices


(  α  β )
( −β  α )


which are, except for notation, the same as the complex numbers
discussed in Problem 4. It follows that such a special matrix is invertible if
and only if αα − β(−β) ≠ 0, which is of course the same condition as
α² + β² ≠ 0. (The awkward form is intended to serve as a reminder of how
it arose this time.) If that condition is satisfied, then the inverse is


    1      ( α  −β )
 ─────── ( β   α ),
 α² + β²


and that is exactly the matrix that corresponds to the complex number


( α/(α² + β²), −β/(α² + β²) ),


in perfect harmony with Solution 10.

Similarly the special matrices


( α  β )
( 0  1 )


are the same as the affine transformations (α, β) discussed in Problem 5.
According to the present result such a special matrix is invertible if and only
if α · 1 − β · 0 ≠ 0, and in that case the inverse is (1/α, −β/α), in perfect
harmony with Solution 11.

It is a consequence of these comments that not only is Problem 6 a
generalization of Problems 4 and 5, but, correspondingly, Solution 12 has
Solutions 10 and 11 as special cases.
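
For readers who prefer to let a machine carry out the two multiplications, here is a sketch in Python with exact rational arithmetic. The names mul and inv are illustrative; matrices are represented as pairs of rows, which is an assumption of this sketch rather than notation from the text.

```python
from fractions import Fraction


def mul(A, B):
    """Product of two 2 x 2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def inv(A):
    """Inverse by the formula (1 / (ad - bc)) * ((d, -b), (-c, a))."""
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ZeroDivisionError("ad - bc = 0: the matrix is not invertible")
    return ((d / det, -b / det), (-c / det, a / det))


IDENTITY = ((1, 0), (0, 1))

general = ((2, 7), (1, 5))          # ad - bc = 3
complex_like = ((3, 4), (-4, 3))    # corresponds to the complex number (3, 4)
affine_like = ((2, 6), (0, 1))      # corresponds to the affine pair (2, 6)

for A in (general, complex_like, affine_like):
    print(mul(A, inv(A)) == IDENTITY, mul(inv(A), A) == IDENTITY)
```

The top row of inv(complex_like) is (3/25, −4/25), and the top row of inv(affine_like) is (1/2, −3), which agree with the formulas of Solutions 10 and 11 for those sample values.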


Solution 13. 


(a) The verification that min is both commutative and associative is straight- 
forward. If anything goes wrong, it must have to do with the existence of a 
neutral element, an identity element, that plays the role of 0. The question is
this: does there exist a positive real number z such that


min(x, z) = x


for every positive real number x? The equation demands that z be greater
than or equal to every positive real number x, in other words that z be “the
largest real number”. That’s nonsense: there is no such thing; the present
candidate fails to be a group.

(b) The verification of commutativity and associativity is easy again.
The search for 0 this time amounts to the search for a number z in the set
{1, 2, 3, 4, 5} with the property that


max(x, z) = x


for every number x in the set. The equation demands that z be less than or
equal to every positive integer between 1 and 5, and that’s easy; the number
1 does the job. It remains to look for inverses. Given x, can we find y so that
max(x, y) = 1? No, that’s impossible: the equation can never be satisfied
unless x = y = 1.

(c) Given that x + y = y, add (—y) to both sides of the equation. The 
right side becomes 0, and the left side becomes 


(x + y) + (−y) = x + (y + (−y)) = x + 0 = x,


and, consequently, x = 0. 


Comment. What went wrong in (a) was caused by the non-existence of a 
largest positive real number. What happens if R+ is replaced by a bounded 
set of positive real numbers, such as the closed unit interval [0, 1]? Does the 
operation min produce a group then? Commutativity, associativity, and the 
existence of a zero element are satisfied (the role of 0 being played by 1); 
the question is about inverses. Is it true that to every number x in [0, 1] there 
corresponds a number y in [0, 1] such that min(x, y) = 1? Certainly not; that 
can happen only if x = 1. 

Does the argument for (c) use the commutativity of +? Associativity? 
Both the defining properties of 0? 


Solution 14. 


The set of those affine transformations


ξ ↦ αξ + β


(discussed in Problem 5) for which α ≠ 0 does not have the first of the
defining properties of abelian groups (commutativity), but it has all the others
(the associative law, the existence of an identity element, and the existence of
an inverse for every element; see Problem 11); it is a group.

The set of invertible 2 × 2 matrices is not commutative, but has the other
properties of abelian groups (see Problem 12); it is a group. 

The product 2 × 3 is equal to 0 modulo 6. That is: multiplication modulo 6
is not defined in the domain in question, or, in other words, the set
{1, 2, 3, 4, 5} is not closed under the operation. Conclusion: the non-zero
integers modulo 6 do not form a multiplicative group.

If α is any one of the numbers 1, 2, 3, 4, 5, 6, what can be said about
the numbers


α × 1, α × 2, α × 3, α × 4, α × 5, α × 6


(multiplication modulo 7)? First answer: none of them is 0 (modulo 7). (Why?
This is important, and it requires a moment’s thought.) Second (as a consequence
of the first): they are all different. (Why?) Third (as a consequence of
the second): except possibly for the order in which they appear, they are the
same as the numbers 1, 2, 3, 4, 5, 6, and therefore, in particular, one of them
is 1. That is: for each number α there is a number β such that α × β = 1; this
is exactly the assertion that every α has a multiplicative inverse. Conclusion:
the non-zero integers modulo 7 form a multiplicative group.
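
The contrast between the moduli 6 and 7 shows up immediately in a small computation. The sketch below is in Python; the helper name products is an illustrative choice.

```python
def products(a, n):
    """The numbers a x 1, a x 2, ..., a x (n - 1), reduced modulo n."""
    return [(a * b) % n for b in range(1, n)]


# Modulo 7: each row is a rearrangement of 1, ..., 6, so it contains 1.
for a in range(1, 7):
    row = products(a, 7)
    print(a, row, "inverse:", row.index(1) + 1)

# Modulo 6: the row of 2 contains 0, so {1, ..., 5} is not closed.
print(products(2, 6))
```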


Solution 15. 


If there are only two distinct elements, an identity element 1 and another one,
say α, then the “multiplication table” for the operation looks like


×   1   α

1   1   α
α   α   ?


If the question mark is replaced by 1, the operation is associative; if it is
replaced by α, then the element α has no inverse. Conclusion: two elements
are not enough to provide a counterexample.

If there are three distinct elements, an identity 1, and two others, α and
β, then there is more elbow room, and, for instance, one possibility is


×   1   α   β

1   1   α   β
α   α   x   1
β   β   1   y


No matter what x and y are (among 1, α, and β) the operation that the table
defines has an identity and every element has an inverse. If x = β and y = α,
the result is associative, so that it does not serve as an example of the sort of
thing wanted. If, however, x = α, then


(αα)β = αβ = 1


and

α(αβ) = α1 = α,


so that the operation is not associative (and the same desired negative conclusion
follows if y = β).
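
The three-element case can be explored by brute force. The sketch below is in Python and rests on an assumption: the positions of x and y in the table (x = αα, y = ββ, with αβ = βα = 1) are inferred from the computation just given, and the names make_op and is_associative are illustrative.

```python
ELEMENTS = ["1", "a", "b"]   # "a" and "b" stand for alpha and beta


def make_op(x, y):
    """Operation with identity "1", a*b = b*a = "1", a*a = x, b*b = y (assumed layout)."""
    table = {("a", "a"): x, ("a", "b"): "1", ("b", "a"): "1", ("b", "b"): y}

    def op(s, t):
        if s == "1":
            return t
        if t == "1":
            return s
        return table[(s, t)]

    return op


def is_associative(op):
    return all(op(op(s, t), u) == op(s, op(t, u))
               for s in ELEMENTS for t in ELEMENTS for u in ELEMENTS)


for x in ELEMENTS:
    for y in ELEMENTS:
        print("x =", x, "y =", y, "associative:", is_associative(make_op(x, y)))
```

Under this reading of the table, the choice x = β, y = α is associative, while x = α fails for every y.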


Solution 16. 


Yes, everything is fine, multiplication in a field must be commutative, and,
in particular, 0 · x = x · 0 = 0 for every x, but it’s a good idea to look at
the sort of thing that can go wrong if not both distributive laws are assumed.
Question: if F is an abelian group with +, and if F* is an abelian group with
×, and if the distributive law


α(x + y) = αx + αy

is true for all α, x, and y, does it follow that multiplication in F is commutative?
Answer: no. Here is an artificial but illuminating example.

Let F be the set of two integers 0 and 1 with addition defined modulo 2,
and with multiplication defined so that x · 0 = 0 for all x (that is, for x = 0 and
for x = 1) and x · 1 = 1 for all x. (Recall that in addition modulo 12 multiples
of 12 are discarded; in addition modulo 2 multiples of 2 are discarded. The
only thing peculiar about addition modulo 2 is that 1 + 1 = 0.) It is clear that
F with + is an abelian group, and it is even clearer that F* (which consists
of the single element 1) with × is an abelian group. The distributive law


α(x + y) = αx + αy


is true; to prove it, just examine the small finite number of possible cases.
On the other hand the distributive law


(α + β)x = αx + βx

is not true; indeed

(0 + 1) · 1 = 1

and


0 · 1 + 1 · 1 = 1 + 1 = 0.

Irrelevant side remark: the associative law α(βγ) = (αβ)γ is true, by
straightforward verification. The commutative law is false, by definition:
0 · 1 = 1 and 1 · 0 = 0.
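
The "small finite number of possible cases" can be enumerated directly. A sketch in Python follows; the names plus and times are illustrative, and times encodes the rule x · 0 = 0, x · 1 = 1 (the product is always the right-hand factor).

```python
F = [0, 1]


def plus(a, b):
    """Addition modulo 2."""
    return (a + b) % 2


def times(a, b):
    """The artificial product: x . 0 = 0 and x . 1 = 1 for all x."""
    return b


left_distributive = all(
    times(a, plus(x, y)) == plus(times(a, x), times(a, y))
    for a in F for x in F for y in F)

right_distributive = all(
    times(plus(a, b), x) == plus(times(a, x), times(b, x))
    for a in F for b in F for x in F)

associative = all(
    times(a, times(b, c)) == times(times(a, b), c)
    for a in F for b in F for c in F)

commutative = all(times(a, b) == times(b, a) for a in F for b in F)

print(left_distributive, right_distributive, associative, commutative)
```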

If, however, both distributive laws are assumed, in other words, if the
system under consideration is a bona fide field, then all is well. Indeed, since


(0 + 1)x = 0 · x + 1 · x

for all x, and since the left side of this equation is x whereas the right side is

0 · x + x,

it follows (from Problem 13) that

0 · x = 0

for all x. A similar use of the other distributive law,

x(0 + 1) = x · 0 + x · 1,

implies that

x · 0 = 0


for all x. In other words, every product that contains 0 as a factor is equal
to 0, and that implies everything that’s wanted, and it implies, in particular,
that multiplication is both associative and commutative.


Solution 17. 


(a) It is to be proved that 0 × α acts the way 0 does, so that what must be
shown is that 0 × α added to any β yields β. It must in particular be true that
(0 × α) + α = α (= 0 + α), and, in fact, that’s enough: if that is true then
the additive cancellation law implies that 0 × α = 0. The proof therefore can
be settled by the following steps:


(0 × α) + α = (0 × α) + (1 × α)   (because 1 is the multiplicative unit)
            = (0 + 1) × α   (by the distributive law)
            = 1 × α   (because 0 is the additive unit)
            = α.


(b) It is to be proved that (−1)α acts the way −α does, so that what
must be shown is that α + (−1)α = 0. Proof:


α + (−1)α = (1 × α) + ((−1) × α) = (1 + (−1)) × α = 0 × α = 0.

(c) It helps to know “half” of the asserted equation, namely


(−α)β = −(αβ),


and the other, similar, half


α(−β) = −(αβ).

The first half is true because

αβ + (−α)β = (α + (−α))β   (distributive law)
           = 0 × β = 0,

which shows that (−α)β indeed acts just the way −(αβ) is supposed to. The
other half is proved similarly. The proof of the main assertion is now an easy
two step deduction:


(−α)(−β) = −(α(−β)) = −(−(αβ)) = αβ.

(d) This is not always true. Counterexample: integers modulo 2. (See 
Problem 18.) 
(e) By definition the non-zero elements of F constitute a multiplicative 
group, which says, in particular, that the product of two of them is again one 
of them. 


Solution 18. 


The answer is yes. The example illustrates the possible failure of the distributive
law and hence emphasizes the essential role of that law.

Let F be {0, 1, 2, 3, 4}, with + being addition modulo 5 and ×₁ being
multiplication modulo 5. In this case all is well; (F, +, ×₁) is a field.

An efficient way of defining a suitable ×₂ is by a multiplication table,
as follows:


A verbal description of the multiplication of the elements 2, 3, and 4 is this:
the product of two distinct ones among them is the third. Compare Problem
8. The distributive law does indeed break down:


2 ×₂ (3 + 4) = 2 ×₂ 2 = 1,

but


(2 ×₂ 3) + (2 ×₂ 4) = 4 + 3 = 2.
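
The breakdown can be reproduced numerically, under assumptions that go beyond what survives of the table here. In the Python sketch below, times2 is a hypothetical completion: it uses the stated facts (the product of two distinct elements among 2, 3, 4 is the third, and 2 ×₂ 2 = 1) and additionally assumes that 1 is the identity, that 3 ×₂ 3 = 4 ×₂ 4 = 1, and that every product with 0 is 0.

```python
def plus(a, b):
    """Addition modulo 5."""
    return (a + b) % 5


def times2(a, b):
    # ASSUMPTION: a completion of the second multiplication consistent with
    # the verbal description; the squares of 3 and 4 and the products with 0
    # are not confirmed by the text.
    if a == 0 or b == 0:
        return 0
    return ((a - 1) ^ (b - 1)) + 1


nonzero = [1, 2, 3, 4]

# The non-zero elements form an abelian group under times2.
group_ok = (
    all(times2(a, b) in nonzero for a in nonzero for b in nonzero)
    and all(times2(a, b) == times2(b, a) for a in nonzero for b in nonzero)
    and all(times2(times2(a, b), c) == times2(a, times2(b, c))
            for a in nonzero for b in nonzero for c in nonzero)
    and all(times2(1, a) == a for a in nonzero)
    and all(any(times2(a, b) == 1 for b in nonzero) for a in nonzero)
)

lhs = times2(2, plus(3, 4))               # 2 x (3 + 4)
rhs = plus(times2(2, 3), times2(2, 4))    # (2 x 3) + (2 x 4)
print(group_ok, lhs, rhs)
```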


Comment. This is far from the only solution. To get another one, let F be
{0, 1} with + being addition and ×₁ being multiplication modulo 2; in this
case (F, +, ×₁) is a field. If, on the other hand, ×₂ is defined by the ridiculous
equation


α ×₂ β = 1

for all α and β, then

1 ×₂ (1 + 1) = 1

but


(1 ×₂ 1) + (1 ×₂ 1) = 1 + 1 = 0.


Solution 19. 


The answer is yes, there does exist a field with four elements, but the proof
is not obvious. An intelligent and illuminating approach is to study the set P
of all polynomials with coefficients in a field and “reduce” that set “modulo”
some particular polynomial, the same way as the set Z of integers is reduced
modulo a prime number p to yield the field Zₚ.

Logically, the right coefficient field to start with for the purpose at hand
is Z₂, but to get used to the procedure it is wise to begin with a more familiar
situation, which is not directly relevant.

Let P be the set of all polynomials with coefficients in the field Q of
rational numbers, and let p be the particular polynomial defined by


p(x) = x² − 2.


Important observation: the polynomial p is irreducible. That means non- 
factorable, or, more precisely, it means that if p is the product of two poly- 
nomials with coefficients in Q, then one of them must be a constant. 

Let F be the result of “reducing P modulo p”. A quick way of explaining
what that means is to say that the elements of F are the same as the elements
of P (polynomials with rational coefficients), but the concept of equality is
redefined: for present purposes two polynomials f and g are to be regarded
as the same if they differ from one another only by a multiple of p. The
customary symbol for “equality except possibly for a multiple of p” is ≡, and
the relation it denotes is called congruence. In more detail: to say that f is
congruent to g modulo p, in symbols


f ≡ g modulo p,

means that there exists a polynomial q (with rational coefficients) such that

f − g = pq.


What happens to the “arithmetic” of polynomials when equality is interpreted 
modulo p? That is: what can be said about sums and products modulo p? 

As far as the addition of polynomials of degree 0 and degree 1 is concerned,
nothing much happens:


(αx + β) + (γx + δ) = (α + γ)x + (β + δ),


just as it should be. When polynomials of degree 2 or more enter the picture,
however, something new happens. Example: if


f(x) = x² and g(x) = −2,

then

f(x) + g(x) ≡ 0 (modulo p).


Reason: f + g is a multiple of p (namely p · 1) and therefore


(f + g) − 0 ≡ 0 modulo p.


Once that is accepted, then even multiplication offers no new surprises.
If, for instance,


f(x) = g(x) = x,

then

f · g ≡ 2 (modulo p);


indeed, f · g − 2 = p.

What does a polynomial look like, modulo p? Since x² can always be
replaced by 2 (is “equal” to 2), and, consequently, x³ (= x · x²) can be replaced
by 2x, and x⁴ (= x² · x²) can be replaced by 4, etc., it follows that every
polynomial is “equal” to a polynomial of degree 0 or 1. Once that is agreed
to, it follows with almost no pain that F is a field. Indeed, the verification that
F with addition (modulo p) is an abelian group takes nothing but a modicum
of careful thinking about the definitions. The same statement about the set
of non-zero elements of F with multiplication (modulo p) takes a little more
thought: where do inverses come from? The clue to the answer is in the
following computation:


1/(α + βx) = (α − βx)/(α² − 2β²).


Familiar? Of course it is: it is the same computation as the rationalization of
the denominator that was needed to prove that Q(√2) is a field. All the hard
work is done; the distributive laws give no trouble, and the happy conclusion
is that F is a field, and, in fact, except for notation it is the same as the field
Q(√2).

The same technique can be applied to many other coefficient fields and
many other moduli. Consider, to be specific, the field Z₂, and let P this time
be the set of all polynomials


α₀ + α₁x + α₂x² + ··· + αₙxⁿ


of all possible degrees, with coefficients in Z₂. (Caution: 5x + 3 means


(x + x + x + x + x) + (1 + 1 + 1);


it is a polynomial, and it is equal to x + 1 modulo 2. It is dangerous to jump
to the conclusion that the polynomial x⁵ + x³, which means xxxxx + xxx,
can be reduced similarly.) The set P of all such polynomials is an abelian
group with respect to addition (modulo 2, of course); thus, for example, the
sum of


x⁵ + x³ + x + 1

and


x³ + x² + x

is

x⁵ + x² + 1.
Polynomials admit a natural commutative multiplication also (example:
(x² + 1)(x³ + x) = x⁵ + x),


with a unit (the constant polynomial 1), and addition and multiplication together
satisfy the distributive laws. Not all is well, however; multiplicative inverses
cause trouble. Example: there is no polynomial f such that x f(x) = 1;
the polynomial x (different from 0) has no reciprocal. In this respect polynomials
behave as the integers do: the reciprocal of an integer n is not an
integer (unless n = 1 or n = −1). Just as for integers, reduction by a suitable
modulus can cure the disease. A pertinent modulus for the present problem
is x² + x + 1.

Why is it pertinent? Because reduction modulo a polynomial of degree
k, say, converts every polynomial into one of degree less than k, and modulo
2 there are, for each k, exactly 2ᵏ polynomials of degree less than k. That’s
clear, isn’t it? To determine a polynomial of degree k − 1 or less, the number
of coefficients that has to be specified is k, and there are two choices, namely 0
and 1, for each coefficient. If we want to end up with exactly four polynomials
that constitute a field with four elements, the value of k must therefore be
2. Modulo 2 the four polynomials of degree less than 2 are 0, 1, x, and
x + 1. Just as the modulus by which the integers must be reduced to get
a field must be a prime (an unfactorable, irreducible number), the modulus
by which the polynomials must be reduced here should be an unfactorable,
irreducible polynomial. Modulo 2 there are exactly four polynomials of degree
exactly 2, namely the result of adding one of 0, 1, x, or x + 1 to x². Three
of those, namely


x² = x · x,
x² + 1 = (x + 1)(x + 1),
and
x² + x = x(x + 1)


are factorable; the only irreducible polynomial of degree 2 is x² + x + 1.
The reduced objects, the four polynomials


0, 1, x, x + 1

are added (modulo 2) the obvious way; the modulus does not enter. It does
enter into multiplication. Thus, for instance, to multiply modulo x² + x + 1,
first multiply the usual obvious way and then throw away multiples of
x² + x + 1. Example: x³ = 1 (modulo x² + x + 1). Reason:

x³ = x(x²) = x((x² + x + 1) + (x + 1)) = x(x + 1)
   = x² + x = (x² + x + 1) + 1 = 1.

The multiplication table looks like this:


×      0    1      x      x + 1

0      0    0      0      0
1      0    1      x      x + 1
x      0    x      x + 1  1
x + 1  0    x + 1  1      x


The inspiration is now over; what remains is routine verification. The result
is that with addition and multiplication as described the four polynomials 0,
1, x, x + 1 do indeed form a field.
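
The routine verification can be delegated to a short program. The sketch below is in Python; representing the polynomial c0 + c1·x by the pair (c0, c1) and the names add, mul, NAMES are choices made for this illustration only.

```python
from itertools import product

ELEMENTS = [(0, 0), (1, 0), (0, 1), (1, 1)]   # (c0, c1) stands for c0 + c1*x
NAMES = {(0, 0): "0", (1, 0): "1", (0, 1): "x", (1, 1): "x+1"}
ONE = (1, 0)


def add(a, b):
    """Coefficientwise addition modulo 2."""
    return ((a[0] + b[0]) % 2, (a[1] + b[1]) % 2)


def mul(a, b):
    """Multiply, then reduce modulo x^2 + x + 1 using x^2 = x + 1."""
    c0 = a[0] * b[0]
    c1 = a[0] * b[1] + a[1] * b[0]
    c2 = a[1] * b[1]               # coefficient of x^2 before reduction
    return ((c0 + c2) % 2, (c1 + c2) % 2)


nonzero = ELEMENTS[1:]
comm_ok = all(mul(a, b) == mul(b, a) for a, b in product(ELEMENTS, repeat=2))
assoc_ok = all(mul(mul(a, b), c) == mul(a, mul(b, c))
               for a, b, c in product(ELEMENTS, repeat=3))
distrib_ok = all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
                 for a, b, c in product(ELEMENTS, repeat=3))
inverses_ok = all(any(mul(a, b) == ONE for b in nonzero) for a in nonzero)

# Print the multiplication table row by row.
for a in ELEMENTS:
    print(NAMES[a].ljust(4), [NAMES[mul(a, b)] for b in ELEMENTS])
print(comm_ok, assoc_ok, distrib_ok, inverses_ok)
```

The design choice here is to hard-code the single reduction rule x² = x + 1; a general polynomial division routine would work as well but is not needed for degree 2.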

To construct a field with nine elements, proceed similarly: use polynomials
with coefficients in the field of integers modulo 3 and reduce modulo
the polynomial x² + 2x + 2.
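
A sketch in Python of the nine-element construction follows. It relies on the standard fact that a polynomial of degree 2 with no root in the coefficient field is irreducible; the pair representation and the names f and mul are illustrative assumptions.

```python
P = 3


def f(t):
    """The modulus x^2 + 2x + 2 evaluated at t, modulo 3."""
    return (t * t + 2 * t + 2) % P


# No roots modulo 3, hence no linear factor, hence irreducible (degree 2).
roots = [t for t in range(P) if f(t) == 0]

# Elements c0 + c1*x are stored as pairs (c0, c1): nine of them.
elements = [(c0, c1) for c0 in range(P) for c1 in range(P)]


def mul(a, b):
    """Multiply, then reduce using x^2 = -2x - 2 = x + 1 (modulo 3)."""
    c0 = a[0] * b[0]
    c1 = a[0] * b[1] + a[1] * b[0]
    c2 = a[1] * b[1]
    return ((c0 + c2) % P, (c1 + c2) % P)


nonzero = [e for e in elements if e != (0, 0)]

# For each non-zero element, count the elements b with a * b = 1.
inverse_counts = [sum(1 for b in nonzero if mul(a, b) == (1, 0))
                  for a in nonzero]
print(roots, len(elements), inverse_counts)
```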

Is there a field with six elements? The answer is no. The proof depends 
on a part of vector space theory that will be treated later, and the fact itself 
has no contact with the subject of this book. The general theorem is that the 
number of elements in a finite field is always a power of a prime, and that 
for every prime power there is one (and except for change of notation only 
one) finite field with that many elements. 


Chapter 2. Vectors 


Solution 20. 


The scalar zero law is a consequence of the other conditions; here is how the
simple proof goes. If x is in V, then


0x + 0x = (0 + 0)x   (by the vector distributive law)
        = 0x,


and therefore, simply by cancellation in the additive group V, the forced
conclusion is that 0x = 0.

As for the vector zero law, the scalar distributive law implies that α0 is
always zero. Indeed:


α0 + α0 = α(0 + 0) = α0,


and therefore, simply by cancellation in the additive group V, the forced
conclusion is that α0 = 0.

It is good to know that these two results about 0 are in a sense best
possible. That is: if αx = 0, then either α = 0 or x = 0. Reason: if αx = 0
and α ≠ 0, then


x = 1x = ((1/α)α)x = (1/α)(αx)   (by the associative law),


which implies that

x = (1/α)0 = 0.


Comment. If a scalar multiplication satisfies all the conditions in the definition
of a vector space, how likely is it that αx = x? That happens when
x = 0 and it happens when α = 1; can it happen any other way? The answer
is no, and, by now, the proof is easy: if αx = x, then (α − 1)x = 0, and
therefore either α − 1 = 0 or x = 0.

A pertinent comment is that every field is a vector space over itself. Isn’t
that obvious? All it says is that, given F, if the space V is defined to be
F itself, with addition in V being what it was in F and scalar multiplication
being ordinary multiplication in F, then the conditions in the definition of a
vector space are automatically satisfied. Consequence: if F is a field, then the
equation 0α = 0 in F is an instance of the scalar zero law. In other words,
the solution of Problem 17 (a) is a special case of the present one.


Solution 21. 


(1) The vector distributive law fails: indeed

2 ∗ 1 = 2² · 1 = 4,

but

1 ∗ 1 + 1 ∗ 1 = 1 · 1 + 1 · 1 = 2.


The verifications that all the other axioms of a vector space are satisfied are 
painless routine. 

(2) The scalar identity law fails; all other conditions are satisfied.

(3) Since the mapping α ↦ α² is multiplicative ((αβ)² = α²β²), the
associative law for the new scalar product is true (this should be checked,
and it is fun to check). The new scalar identity law follows from the fact that
1² = 1. The verification of the new vector distributive law depends on the
fact that if α and β are scalars (in the present sense, a very special case), then


(α + β)² = α² + β².


(That identity holds, in fact, if and only if the field has “characteristic 2”,
which means that α + α = 0 for every α in F. An equivalent way of expressing
that condition is just to say that 2 = 0, where “2” means 1 + 1, of course.)
The scalar distributive law, however, is false. Indeed:


1((1, 0) + (0,1)) = 1(1,1) = (1,1), 
whereas 


1(1,0) + 1(0,1) = (1 + 1,0) + (0,1) = (1 +1,1). 

(4) Nothing is missing; the definitions of ⊞ and ⊡ do indeed make
R+ into a real vector space.

(5) In this example the associative law fails. Indeed, if α = β = i, then


(αβ) · 1 = (−1)1 = −1,


whereas

α · (β · 1) = 0 · (0) = 0.


The verifications of the distributive laws (vector or scalar), and of the scalar
identity law, are completely straightforward; all that they depend on (in addition
to the elementary properties of the addition of complex numbers) is
that Re does the right thing with 0, 1, and +. (The right thing is Re 0 = 0,
Re 1 = 1, and Re(α + β) = Re α + Re β.)
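
The computation in (5) can be replayed with Python's built-in complex numbers. This is a sketch under an assumption: the scalar product is taken to be α · x = (Re α)x on the complex numbers, which is what the displayed computation suggests; the statement of Problem 21 itself is not reproduced here, and smul is a hypothetical name.

```python
def smul(alpha, x):
    """ASSUMED scalar product: alpha . x = Re(alpha) * x."""
    return alpha.real * x


i = 1j

left = smul(i * i, 1)          # (i i) . 1 = Re(-1) * 1
right = smul(i, smul(i, 1))    # i . (i . 1) = 0 * (0 * 1)
print(left, right)

# The other laws, spot-checked on a few scalars and vectors with small
# integer parts (so the floating-point arithmetic is exact).
samples = [0, 1, 1j, 2 - 3j, -1 + 1j]
vector_law = all(smul(a + b, x) == smul(a, x) + smul(b, x)
                 for a in samples for b in samples for x in samples)
scalar_law = all(smul(a, x + y) == smul(a, x) + smul(a, y)
                 for a in samples for x in samples for y in samples)
identity_law = all(smul(1, x) == x for x in samples)
print(vector_law, scalar_law, identity_law)
```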

(6) Here, once more, nothing is missing. The result is a special case of 
the general observation that if F is a field and G is a subfield, then F is a 
vector space over G. 


Question. What is the status of the zero laws (scalar and vector) in these 
examples? The proof that they held (Problem 20) depended on the truth of 
the other conditions; does the failure of some of those conditions make the 
zero laws fail also? 


Comment. Examples (1), (2), (3), and (5) show that the definition of vec- 
tor spaces by four axioms contains no redundant information. A priori it is 
conceivable that some cleverly selected subset of those conditions (consist- 
ing of three, or two, or even only one) might be strong enough to imply the 
others. There are 15 non-empty subsets, and a detailed study of all possibili- 
ties threatens to be more than a little dull. An examination of some of those 
possibilities can, however, be helpful in coming to understand some of the 
subtleties of the algebra of scalars and vectors, and that’s what the examples 
(1), (2), (3), and (5) have provided. Each of them shows that some particular 
one of the four conditions is independent of the other three: they provide con- 
crete counterexamples (of F, V, and a scalar multiplication defined between 
them) in which three conditions hold and the fourth fails. 

Despite example (5), the associative law is almost a consequence of the 
others. If, to be specific, the underlying field is Q, and if V is a candidate for 
a vector space over Q, equipped with a scalar multiplication that satisfies the 
two distributive laws and the scalar identity law, then it satisfies all the other 
conditions, and, in particular, it satisfies the associative law also, so that V is 
an honest vector space over Q. The proof is not especially difficult, but it is 
of not much use in linear algebra; what follows is just a series of hints. 

The first step might be to prove that 2x is necessarily equal to x + x,
and that, more generally, for each positive integer m, the scalar product mx
is the sum of m summands all equal to x. This much already guarantees that
(αβ)x = α(βx) whenever α and β are positive integers. To get the general
associative law two more steps are necessary. One: recall that 0 · x = 0 and
(−1)x = −x (compare the corresponding discussions of the status of the
other vector space axioms); this yields the associative law for all integers.
Two: ½x + ½x = x, and, more generally, the sum of n summands all equal
to (1/n)x is equal to x; this yields the associative law for all reciprocals of
integers. Since every rational number has the form m · (1/n), where m and n
are integers, the associative law follows for all elements of Q. Caution: the
reader who wishes to flesh out this skeletal outline should be quite sure that
the lemmas needed (for example (−1)x = −x) can be proved without the
use of the associative law.

A similar argument can be used to show that if the underlying field 
is the field of integers modulo a prime p, then, again, the associative law 
is a consequence of the others. These facts indicate that for a proof of the 
independence of the associative law the field has to be more complicated than 
Q or Zₚ. (Reminder: fields such as Zₚ occurred in the discussion preceding
Problem 19.) A field that is complicated enough is the field C of complex 
numbers—that’s what the counterexample (5) shows. 


Solution 22. 


It’s easy enough to verify that 


3(1, 1) — 1(1, 2) = (2,1) 
and 
—1(1,1)+1(1, 2) = (0, 1), 


so that (2,1) and (0,1) are indeed linear combinations of (1,1) and (1, 2), 
but these equations don’t reveal any secrets; the problem is where do they 
come from—how can they be discovered? 

The general question is this: for which vectors (α, β) can real numbers
ξ and η be found so that


ξ(1, 1) + η(1, 2) = (α, β)?


In terms of coordinates this vector equation amounts to two numerical equations:


ξ + η = α,
ξ + 2η = β.

To find the unknowns ξ and η, subtract the top equation from the bottom one
to get

η = β − α,


and then substitute the result back in the top equation to get


ξ + β − α = α,


or, in other words,


ξ = 2α − β.


That’s where the unknown coefficients come from, and, once derived, the
consequence is easy enough to check:


(2α − β)(1, 1) + (β − α)(1, 2) = (2α − β + β − α, 2α − β + 2β − 2α)
                               = (α, β).


Conclusion: every vector in R² is a linear combination of (1, 1) and (1, 2).
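
The derived coefficients can be tested on the two vectors mentioned at the start of this solution. The sketch below is in Python; the names coefficients and combine are illustrative.

```python
def coefficients(alpha, beta):
    """Return (xi, eta) with xi*(1, 1) + eta*(1, 2) = (alpha, beta)."""
    xi = 2 * alpha - beta
    eta = beta - alpha
    return xi, eta


def combine(xi, eta):
    """The linear combination xi*(1, 1) + eta*(1, 2)."""
    return (xi * 1 + eta * 1, xi * 1 + eta * 2)


for target in [(2, 1), (0, 1), (-7, 4)]:
    xi, eta = coefficients(*target)
    print(target, "=", xi, "* (1, 1) +", eta, "* (1, 2)",
          combine(xi, eta) == target)
```

For (2, 1) the coefficients come out as 3 and −1, and for (0, 1) as −1 and 1, matching the two equations verified at the beginning.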

The process of solving two linear equations in two unknowns (eliminate 
one of the unknowns and then substitute) is itself a part of linear algebra. It 
is used here without any preliminary explanation because it is almost self- 
explanatory and most students learn it early. (Incidentally: in this context the 
phrase linear equations means equations of first degree, that is, typically, 
equations of the form 


αξ + βη + γ = 0


in the two unknowns ξ and η.)


Solution 23. 


For (a) the sets described by (1), (2), and (4) are subspaces and the sets 
described by (3), (5), and (6) are not. The proofs of the positive answers are 
straightforward applications of the definition; the negative answers deserve at 
least a brief second look. 

(3) The vector 0 (= (0,0,0)) does not satisfy the condition. 

(5) The vector (1, 1, 1) satisfies the condition, but its product by i
(= √−1) does not.

(6) The vector (1,1, 1) satisfies the condition, but its product by i does 
not. 

For (b) the sets described by (2) and (4) are subspaces and the sets 
described by (1) and (3) are not. The proofs of the positive answers are 
straightforward. For the negative answers:

(1) The polynomials x³ + x and −x³ + 2 satisfy the condition, but their
sum does not.

(3) The polynomial x² satisfies the condition, but its product by i
(= √−1) does not.


Comment. The answers (5) for (a) and (3) for (b) show that the sets M 
involved are not subspaces of the complex vector spaces involved—but what 
would happen if C³ in (a) were replaced by R³, and, similarly, the complex
vector space P in (b) were replaced by the corresponding real vector space? 
Answer: the results would stay the same (negative): just replace “i” by “—1”. 


Solution 24. 


(a) The intersection of any collection of subspaces is always a subspace. The 
proof is just a matter of language: it is contained in the meaning of the word 
“intersection”. Suppose, indeed, that the subspaces forming a collection are 
distinguished by the use of an index γ; the problem is to prove that if each
M_γ is a subspace, then the same is true of M = ∩_γ M_γ. Since every M_γ
contains 0, so does M, and therefore M is not empty. If x and y belong to M
(that is, to every M_γ), then αx + βy belongs to every M_γ (no matter what α
and β are), and therefore αx + βy belongs to M. Conclusion: M is a subspace.

(b) If one of two given subspaces is the entire vector space V, then their 
union is V; the question is worth considering for proper subspaces only. If
M₁ and M₂ are proper subspaces, can M₁ ∪ M₂ be equal to V? No, never. If
one of the subspaces includes the other, then their union is equal to the larger 
one, which is not equal to V. If neither includes the other, the reasoning is 
slightly more subtle; here is how it goes. 

Consider a vector x in M₁ that is not in M₂, and consider a vector y
that is not in M₁ (it doesn’t matter whether it is in M₂ or not). The set of all
scalar multiples of x, that is the set of all vectors of the form αx, is a line
through the origin. (The geometric language doesn’t have to be used, but it
helps.) Translate that line by the vector y, that is, form the set of all vectors
of the form αx + y; the result is a parallel line (not through the origin). Being
parallel, the translated line has no vectors in common with M₁. (To see the
geometry, draw a picture; to understand the algebra, write down a precise
proof that αx + y can never be in M₁.) How many vectors can the translated
line have in common with M₂? Answer: at most one. Reason: if both αx + y
and βx + y are in M₂, with α ≠ β, then their difference (α − β)x would be in
M₂, and division by α − β would yield a contradiction. It is a consequence of
these facts that the set L of all vectors of the form αx + y (a line) has at most
one element in common with M₁ ∪ M₂. Since there are as many vectors in
L as there are scalars (and that means at least two), it follows that M₁ ∪ M₂
cannot contain every vector in V.

Granted that V cannot be the union of two proper subspaces, how about 
three? As an example of the sort of thing that can happen, consider the field 
F of integers modulo 2; the set F² of all ordered pairs of elements of F is a
vector space in the usual way. The subset 


{(0,0), (0, 1)} 


is a subspace of F², and so are the subsets


{(0,0), (1,0)} 


and 


{(0,0), (1, 1)}. 


The set-theoretic union of these three subspaces is all of F²; this is an example
of a vector space that is the union of three proper subspaces of itself. The 
example looks degenerate, in a sense: the vector space has only a finite number 
of vectors in it, and it should come as no surprise that it can be the union of 
a finite number of proper subspaces. Every vector space is the union of its 
“lines”, and in the cases under consideration there are only a finite number 
of them. 
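
The space in this example is small enough to be checked exhaustively. Here is a minimal sketch; the helper names line and is_subspace are made up for the purpose, and vectors are represented as pairs of 0's and 1's with arithmetic modulo 2.

```python
from itertools import product

# All four vectors of F^2, where F is the field of integers modulo 2.
space = set(product((0, 1), repeat=2))

def line(v):
    """Set of scalar multiples of v over the integers modulo 2."""
    return {tuple((c * a) % 2 for a in v) for c in (0, 1)}

def is_subspace(s):
    """Contains 0 and is closed under addition mod 2
    (closure under the scalars 0 and 1 is then automatic)."""
    return (0, 0) in s and all(
        tuple((a + b) % 2 for a, b in zip(u, v)) in s for u in s for v in s
    )

lines = [line((0, 1)), line((1, 0)), line((1, 1))]
assert all(is_subspace(m) and m != space for m in lines)  # proper subspaces

union = set().union(*lines)
covers_with_two = any((a | b) == space for a in lines for b in lines)
print(union == space, covers_with_two)
```

The second printed value records that no two of the three lines suffice, in agreement with the result for two proper subspaces proved above.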

Under these circumstances, the intelligent thing to do is to ask about 
infinite fields, and, sure enough, it turns out that a vector space over an 
infinite field is never the union of a finite number of proper subspaces; the 
proof is just a slight modification of the one that worked for n = 2 and all 
fields (infinite or not). 

Suppose, indeed, that M₁, ..., M_n are proper subspaces such that
none of them is included in the union of the others. From the present point of
view that assumption involves no loss of generality; if one of them is included
in the union of the others, just omit it, and note that the only effect of the
omission is to reduce the number n to n − 1. It follows that there exists a
vector x₁ in M₁ that does not belong to M_j for j ≠ 1, and (since M₁ is not
the whole space) there exists a vector x₀ that does not belong to M₁.

Consider the line through x₀ parallel to x₁. Precisely: let L be the set of
all vectors of the form x₀ + αx₁ (α a scalar). How large can the intersections
L ∩ M_j be (where j = 1, ..., n)? Since x₁ belongs to M₁ it follows that
x₀ + αx₁ cannot belong to M₁ (for otherwise x₀ would also); this proves that
L ∩ M₁ = ∅. As for the sets L ∩ M_j with j ≠ 1, they can contain no more
than one vector each. Reason: if both x₀ + αx₁ and x₀ + βx₁ belong to M_j,
then so does their difference, (α − β)x₁, and, since x₁ is not in M_j, that can
happen only when α = β.

Since (by hypothesis) there are infinitely many scalars, the line L
contains infinitely many vectors. Since, however, by the preceding paragraph, the
number of elements in L ∩ (M₁ ∪ ··· ∪ M_n) is less than n, it follows that
M₁ ∪ ··· ∪ M_n cannot cover the whole space; the proof is complete.

What the argument depends on is a comparison between the cardinal 
number of the ground field and a prescribed cardinal number n. Related 
theorems are true for certain related structures. One example: a group is never 
the union of two proper subgroups. Another example: a Banach space is never 
the union of a finite or countable collection of closed proper subspaces. 


Caution. Even if the ground field is uncountable (has cardinal number
greater than ℵ₀, as does R for instance), it is possible for a vector space
to be the union of a countably infinite collection of proper subspaces.
Example: the vector space P of all real polynomials is the union of the subspaces P_n,
consisting of all polynomials of degree less than or equal to n, n = 1, 2, 3, ....


Solution 25. 


(a) Sure, that’s easy; just consider, for instance, the sets {(1,0), (0,1)} and 
{(2,0), (0,2)}. That answers the question, but it seems dishonest—could 
a positive answer have been obtained so that no vector in either set is a 
scalar multiple of a vector in the other set? Yes, and that’s easy too, but it 
requires a few more seconds of thought. One example is {(1, 0), (0, 1)} and 
{(1, 1), (1, —1)}. 

(b) The span of {(1, 1, 1), (0, 1, 1), (0, 0, 1)} is R³, or, in other words,
every vector in R³ is a linear combination of the three vectors in the set.

Why? Because no matter what vector (α, β, γ) is prescribed, coefficients
ξ, η, and ζ can be found so that


ξ(1, 1, 1) + η(0, 1, 1) + ζ(0, 0, 1) = (α, β, γ).


In fact this one vector equation says the same thing as the three scalar
equations


ξ = α,
ξ + η = β,
ξ + η + ζ = γ,


and those are easy equations to solve. The solution is


ξ = α,
η = β − α,
ζ = γ − β.


Check:


α(1, 1, 1) + (β − α)(0, 1, 1) + (γ − β)(0, 0, 1) = (α, β, γ).


Comment. The span of the two vectors (0, 1, 1) and (0, 0, 1) is the set of all
(0, ξ, ξ + η), which is in fact the (η, ζ)-plane. The span of the two vectors
(1, 1, 1) and (0, 1, 1) is the plane consisting of the set of all (ξ, ξ + η, ξ + η),
and the span of (1, 1, 1) and (0, 0, 1) is still another plane.


Solution 26. 


Yes, it follows. To say that x ∈ ∨{M, y} means that there exists a vector z
in M and there exist scalars α and β such that


x = αy + βz.


It follows, of course, that


αy = x − βz,


and, moreover, that α ≠ 0; the latter because otherwise x would belong to
M, contradicting the assumption. Conclusion:


y ∈ ∨{M, x},


and that implies the equality of the spans of {M, x} and {M, y}.


Solution 27. 


(a) No, there is no vector that spans R². Indeed, for each vector (x, y) in R²,
its span is the set of all scalar multiples of it, and that can never contain every
vector. Reason: if x = 0, then (1, 0) is not a multiple of (x, y), and if x ≠ 0,
then (x, y + 1) is not a multiple of (x, y).

(b) Yes, there are two vectors that span R², in many ways. One obvious
example is (1, 0) and (0, 1); another is (1, 1) and (1, −1) (see Problem 25).

(c) No, no two vectors can span R³. Suppose, indeed, that


x = (x₁, x₂, x₃) and y = (y₁, y₂, y₃)


are any two vectors in R³; the question is whether for an arbitrary z =
(z₁, z₂, z₃) coefficients α and β can be found so that αx + βy = z. In other
words, for given (x₁, x₂, x₃) and (y₁, y₂, y₃) can the equations


αx₁ + βy₁ = z₁,
αx₂ + βy₂ = z₂,
αx₃ + βy₃ = z₃,


be solved for the unknowns α and β, no matter what z₁, z₂, and z₃ are?
The negative answer can be proved either by patiently waiting till the present 
discussion of linear algebra reaches the pertinent discussion of dimension 
theory, or by making use of known facts about the solution of three equations 
in two unknowns (which belongs to the more general context of systems 
with more equations than unknowns). In geometric language the facts can be 
expressed by saying that all linear combinations of x and y are contained in 
a single plane. 

(d) No, no finite set of vectors spans the vector space P of all polyno- 
mials (no matter what the underlying coefficient field is). The reason is that 
polynomials have degrees. In a finite set of polynomials there is one with 
maximum degree; no linear combination of the set will produce a polynomial 
with greater degree than that. Since P contains polynomials of all degrees, 
the span of the finite set cannot exhaust P. Compare the cautionary comment 
at the end of Solution 24. 


Solution 28. 


The modular identity does hold for subspaces. 

The easy direction is ⊃: the right side is included in the left. Reason:
both L ∩ M and L ∩ N are included in L (obviously), and both are included
in M + (L ∩ N). In other words, both
summands on the right are included in the left, and, therefore, so is their sum.

The reverse direction takes a little more insight. If x is a vector in the left
side, then x ∈ L and x = y + z with y ∈ M and z ∈ L ∩ N. Since y = x − z,
and since −z belongs to L ∩ N along with z, so that, in particular, −z ∈ L,
it follows that y ∈ L. Since by the choice of notation, y ∈ M, it follows that
y ∈ L ∩ M, and hence that


x ∈ (L ∩ M) + (L ∩ N),


as promised.
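
For a small finite vector space the identity can also be confirmed by brute force. The sketch below assumes, as the proof above indicates, that the identity in question reads L ∩ (M + (L ∩ N)) = (L ∩ M) + (L ∩ N); it works in F³ with F the integers modulo 2, and all helper names are invented for the illustration. It is a sanity check, not a substitute for the proof.

```python
from itertools import product

VECTORS = list(product((0, 1), repeat=3))  # the 8 vectors of F^3, F = Z mod 2

def add(u, v):
    return tuple((a + b) % 2 for a, b in zip(u, v))

def span(gens):
    """Smallest subspace containing gens; modulo 2, adjoining a generator g
    to a subspace s yields the union of s and its translate g + s."""
    s = {(0, 0, 0)}
    for g in gens:
        s |= {add(g, v) for v in s}
    return frozenset(s)

# Every subspace of a 3-dimensional space is spanned by at most 3 vectors.
SUBSPACES = {span(g) for g in product(VECTORS, repeat=3)}

def sum_of(a, b):
    """The subspace a + b (all sums u + v with u in a and v in b)."""
    return frozenset(add(u, v) for u in a for v in b)

def modular_identity_holds(L, M, N):
    left = L & sum_of(M, L & N)
    right = sum_of(L & M, L & N)
    return left == right

print(len(SUBSPACES))
print(all(modular_identity_holds(L, M, N)
          for L in SUBSPACES for M in SUBSPACES for N in SUBSPACES))
```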


Solution 29. 


The question is when do addition and intersection satisfy the distributive law. 
Half the answer is obvious: the right side is included in the left. Reason: both 
L ∩ M and L ∩ N are included in both L and M + N.

As for the other half, if every vector in V is a scalar multiple of a 
particular vector x, then V has very few subspaces—in fact, only two, O and 
V. In that case the distributive law for subspaces is obviously true; in all other 
cases it’s false. 

Suppose, indeed, that V contains two vectors x and y such that neither
one is a scalar multiple of the other. (Look at a picture in R².) If L, M, and
N are the sets of all scalar multiples of x + y, x, and y, respectively, then
L ∩ M and L ∩ N are O, so that the right side is O, whereas M + N includes
L, so that the left side is L.
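
The counterexample can be made concrete in a finite setting. The following sketch takes the distributive law to be L ∩ (M + N) = (L ∩ M) + (L ∩ N), which is what the argument above suggests, and works with pairs of integers modulo 3; the modulus and the helper names are assumptions chosen only for the illustration.

```python
P = 3  # assumed prime modulus; any prime would serve for this sketch

def multiples(v):
    """All scalar multiples of v over the integers modulo P."""
    return frozenset(tuple((c * a) % P for a in v) for c in range(P))

def sum_of(a, b):
    """The subspace a + b (all sums u + v with u in a and v in b)."""
    return frozenset(
        tuple((s + t) % P for s, t in zip(u, v)) for u in a for v in b
    )

x, y = (1, 0), (0, 1)        # neither is a scalar multiple of the other
L = multiples((1, 1))        # multiples of x + y
M = multiples(x)
N = multiples(y)

left = L & sum_of(M, N)             # L ∩ (M + N)
right = sum_of(L & M, L & N)        # (L ∩ M) + (L ∩ N)
print(left == L, right == frozenset({(0, 0)}))
```

Both comparisons come out true: the left side is all of L while the right side is the zero subspace, so the two sides differ.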


Solution 30. 


For most total sets E in a vector space V it is easy to find a subspace M that 
has nothing in common with E. For a specific example, let V be R² and let
E be {(1, 0), (0, 1)}; the subspace M spanned by (1, 1) is disjoint from E.


Solution 31. 
The answers are yes, and the proofs are easy. 

If x₀ = Σ_{j=1}^{n} α_j x_j, then put α₀ = −1 and note that Σ_{j=0}^{n} α_j x_j = 0.
Since not all the scalars α₀, α₁, ..., α_n are 0 (because at least α₀ is not), it
follows that the enlarged set {x₀, x₁, ..., x_n} is dependent.

In the converse direction, if Σ_{j=0}^{n} α_j x_j = 0, with not every α_j equal
to 0, then there is at least one index i such that α_i ≠ 0. Solve for x_i to get
x_i = −Σ_{j≠i} (α_j/α_i) x_j. (The symbol Σ_{j≠i} indicates the sum extended over the
indices j different from i.) That’s it: the last equation says that x_i is a linear
combination of the other x’s.

It is sometimes convenient to regard a finite set {x₀, x₁, ..., x_n} of
vectors as presented in order, the order of indices, and then to ask about the
dependence of the initial segments {x₀}, {x₀, x₁}, {x₀, x₁, x₂}, etc. The
proof given above yields the appropriate result. A more explicit statement
is this corollary: a set {x₀, x₁, ..., x_n} of non-zero vectors is dependent if
and only if at least one of the vectors x₁, ..., x_n is a linear combination of
the preceding ones. The important word is “preceding”. The proof of “if” is
trivial. The proof of “only if” is obtained from the second half of the proof
given above by choosing x_i to be the first vector after x₀ for which the set
{x₀, ..., x_i} is linearly dependent. (Caution: is it certain that there is such
an x_i?) The desired result is obtained by solving such a linear dependence
relation for x_i.


Solution 32. 


Yes, every finite-dimensional vector space has a finite basis; in fact, if E is a 
finite total set for V, then there exists an independent subset F of E that is a 
basis for V. The trick is to use Problem 31. 

If V = O, the result is trivial; there is no loss of generality in assuming
that V ≠ O. In that case suppose that E is a finite total set for V and begin by
asking whether 0 belongs to E. If it does, discard it; the resulting set (which
might as well be denoted by E again) is still total for V. If E is independent, 
there is nothing to do; in that case F = E. If E is dependent, then, by Problem 
31, there exists an element of E that is a linear combination of the others. 
Discard that element, and note that the resulting set (which might as well be 
denoted by E again) is still total for V. Keep repeating the argument of the 
preceding two sentences as long as necessary; since E is finite, the repetitions 
have to stop in a finite number of steps. The only thing that can stop them is 
arrival at an independent set, and that completes the proof. 
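
The discarding procedure is an algorithm, and it can be sketched in a few lines. The version below is an illustration under stated assumptions, not part of the original argument: vectors are tuples of rational numbers, dependence is tested by comparing ranks computed by Gaussian elimination, and the names rank and prune_to_basis are invented here.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(a) for a in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def prune_to_basis(total_set):
    """Discard 0, then repeatedly discard a vector that is a linear
    combination of the others, until the remaining set is independent."""
    e = [v for v in total_set if any(a != 0 for a in v)]
    while rank(e) < len(e):                 # e is still dependent
        for i in range(len(e)):
            rest = e[:i] + e[i + 1:]
            if rank(rest) == rank(e):       # e[i] lies in the span of the rest
                e = rest
                break
    return e

E = [(0, 0, 0), (1, 1, 0), (2, 2, 0), (0, 1, 1), (1, 2, 1), (0, 0, 1)]
basis = prune_to_basis(E)
print(basis)
```

Which basis comes out depends on the order in which redundant vectors are discarded; the proof makes no promise about that either.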


Chapter 3. Bases 


Solution 33. 


If T is a total set for a vector space V, and E is a finite independent set in 
V, then there exists a subset F of T, with the same number of elements as E 
such that (T — F) UE is total. 

The proof is simplest in case E consists of a single non-zero vector x.
All that has to be done then is to express x as a linear combination Σ_i α_i y_i
of vectors in T and find a coefficient α_i different from 0. From x = Σ_j α_j y_j
it follows that y_i = (1/α_i)(x − Σ_{j≠i} α_j y_j). If y_i is discarded from T and
replaced by x, the result is just as total as it was before, because each linear
combination of vectors in T is equal to a linear combination of x and of
vectors in T different from y_i.

In the general case, E = {x₁, ..., x_n}, apply the result of the preceding
paragraph inductively to one x at a time. Begin, that is, by finding y₁ in T₁
(= T) so that T₂ = (T₁ − {y₁}) ∪ {x₁} is total. For the second step, find y₂
in T₂ so that T₃ = (T₂ − {y₂}) ∪ {x₂} is total, and take an additional minute
to become convinced that T₃ contains x₁, that is that y₂ couldn’t have been
x₁. The reason for the latter is the assumed independence of the x’s; if x₁
had been discarded from T₂, no linear combination of x₂, together with the
vectors that have not been discarded, could recapture it. Keep going the same
way, forming


T_{k+1} = (T_k − {y_k}) ∪ {x_k},


till T_{n+1} is reached. The result is a new total set obtained from T by changing
a subset F = {y₁, ..., y_n} of T into the prescribed set


E = {x₁, ..., x_n}.


The name of the result is the Steinitz exchange theorem.

The result has three useful corollaries.


Corollary 1. If E is an independent set and T is a total set in a finite-
dimensional vector space, then the number of elements in E is less than or 
equal to the number of elements in T. 


Corollary 2. Any two bases for a finite-dimensional vector space have the 
same number of elements. 


The dimension of a finite-dimensional vector space V, abbreviated dim V, 
is the number of elements in a basis of V. 


Corollary 3. Every set of more than n vectors in a vector space V of
dimension n is dependent. A set of n vectors in V is a basis if and only if it 
is independent, or, alternatively, if and only if it is total. 


Note that these considerations answer, in particular, a question asked 
long before (Problem 27), namely whether two vectors can span R³. Since
dim R³ = 3, the answer is no.


Solution 34. 


If several subspaces of a space V of dimension n have a simultaneous com- 
plement, then they all have the same dimension, say m, so that that is at 
least a necessary condition. Assertion: if the coefficient field is infinite, then 
that condition is sufficient also: finite collections of subspaces of the same 
dimension m necessarily have simultaneous complements. 

If the common dimension m is equal to n, then each of the given
subspaces is equal to V (is it fair in that case to speak of “several” subspaces?),
and the subspace {0} is a simultaneous complement, a thoroughly
uninteresting degenerate case. If m < n, then the given subspaces M₁, ..., M_k are
proper, and it follows from Problem 24 that there exists a vector x in V that
doesn’t belong to any of them. If L is the 1-dimensional space spanned by x,
then M_j ∩ L = {0} for each j, and, moreover, all the subspaces


M₁ + L, ..., M_k + L


have dimension m + 1. Either m + 1 = n (in which case M_j + L = V
for each j, and, in fact, L is a simultaneous complement of all the M_j’s),
or m + 1 < n, in which case the reasoning can be applied again. Applying
it inductively a total of n − m times produces the promised simultaneous
complement.

The generalization of Problem 24 to uncountable ground fields and countable
collections of proper subspaces is just as easy to apply as the ungener-
alized version. Conclusion: if the ground field is uncountable, then countable 
collections of subspaces of the same dimension m necessarily have simulta- 
neous complements. 


Solution 35. 


(a) If x and 1 are linearly dependent, then there exist rational numbers α
and β, not both 0, such that α · 1 + β · x = 0. The coefficient β cannot
be 0 (for if it were, then α too would have to be), and, consequently, this
dependence relation implies that x = −α/β, and hence that x is rational. The
reverse implication is equally easy: x and 1 are linearly dependent if and only
if x is rational.

(b) The solution of two equations in two unknowns is involved, namely
the equations


α(1 + ξ) + β(1 − ξ) = 0
α(1 − ξ) + β(1 + ξ) = 0


in the unknowns α and β. If ξ ≠ 0, then α and β must be 0; the only case
of linear dependence is the trivial one, (1, 1) and (1, 1).


Solution 36. 


How about (x, 1, 0), (1, x, 1), and (0, 1, x)? The assumption of linear
dependence leads to three equations in three unknowns that form a conspiracy: they
imply that x(x² − 2) = 0. Consequence: x must be 0 or else ±√2, and,
indeed, in each of those cases, linear dependence does take place. That makes
sense for R, but not for Q; in that case linear dependence can take place only
when x = 0.
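
One way to double-check the polynomial x(x² − 2) is through a determinant. The sketch below leans on a standard fact not yet established at this point in the discussion, namely that three vectors in a 3-dimensional space are dependent exactly when the determinant of the matrix they form is 0; the function names are invented for the illustration.

```python
def det3(rows):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def dependence_determinant(x):
    """Determinant of the matrix with rows (x, 1, 0), (1, x, 1), (0, 1, x)."""
    return det3([(x, 1, 0), (1, x, 1), (0, 1, x)])

# Two cubic polynomials that agree at more than three points are identical,
# so agreement on this range confirms that the determinant is x(x^2 - 2).
assert all(dependence_determinant(x) == x * (x * x - 2) for x in range(-5, 6))

# Integer values in the range for which the three vectors are dependent.
integer_roots = [x for x in range(-5, 6) if dependence_determinant(x) == 0]
print(integer_roots)
```

Only x = 0 shows up among the integers tested, which fits the conclusion for Q: the other two roots, ±√2, are irrational.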


Solution 37. 


(a) If (1, α) and (1, β) are to be linearly independent, then clearly α cannot
be equal to β, and, conversely, if α ≠ β, then linear independence does take
place.

(b) No, there is not enough room in C² for three linearly independent
vectors; the trouble is that two homogeneous equations in three unknowns always
have a non-trivial solution. Better: C² has dimension 2, and the existence of
three linearly independent vectors would imply that the dimension is at least
3.


Solution 38. 


Why not? 

For (a) consider, for instance, two independent vectors in C², such as
(1,0) and (1,—1), each of which is independent of (1,1), and use them to 
doctor up the two given vectors. One possibility is to adjoin 


(0, 0, 1, 0) and (1, 0, 0, 0) 
to the first given pair and adjoin 
(0,0, 1, —1) and (1, -1, 0,0) 


to the second given pair. 
For (b), adjoin 


(0, 0, 1, 0) and (0,0, 1,1) 
to the first two vectors and adjoin
(−1, 1, 0, 0) and (0, 1, 0, 0)


to the second two. 


Solution 39. 


(a) Never; there is too much room in C³. Better: since the dimension of C³
is 3, two vectors can never constitute a basis in it.

(b) Never—the sum of the first two is the third—they are linearly depen- 
dent. 


Solution 40. 


How many vectors can there be in a maximal linearly independent set? Clearly 
not more than 4, and it doesn’t take much work to realize that any four of 
the six prescribed vectors are linearly independent. Conclusion: the answer is 
the number of 4-element subsets of a 6-element set, that is, (6 choose 4) = 15.


Solution 41.


If x is an arbitrary non-zero vector in V, then x and ix (= √−1 · x) are linearly
independent over R. (Reason: if α and β are real numbers and if


αx + β(ix) = 0,


then


(α + βi)x = 0,


and since x ≠ 0, it follows that α + βi = 0, and hence, α and β being real,
that α = β = 0.) Consequence: if the vectors x₁, x₂, x₃, ... constitute a basis
in V, then the same vectors, together with their multiples by i, constitute a
basis in V regarded as a real vector space. Conclusion: the “real
dimension” of V is 2n. Unsurprising corollary: the real dimension of C
is 2.


Solution 42. 


Suppose, more generally, that M and N are finite-dimensional subspaces of a
vector space, with M ⊂ N. If M ≠ N, then a basis for M cannot span N. Take
a basis for M and adjoin to it a vector in N that is not in M. The result is a 
linearly independent set in N containing more elements than the dimension of 
M—which implies that M and N do not have the same dimension. Conclusion: 
if a subspace of N has the same dimension as N, then it must be equal to N. 


Solution 43. 


The answer is yes; every finite independent set in a finite-dimensional vector 
space can be extended to a basis. The assertion (Problem 32) that in a finite- 
dimensional vector space there always exists a finite basis is a special case: 
it just says that the empty set (which is independent) can be extended to a 
basis. 

The proof of the general answer has only one small trap. Given a finite 
independent set E, consider an arbitrary finite basis B, and apply the Steinitz 
exchange theorem (see Solution 33). The result is that there exists a total set 
that includes E and has the same number of elements as B; but is it obvious 
that that set must be independent? Yes, it is obvious. If it were dependent, 
then (see Problem 32) a proper subset of it would be a basis, contradicting the 
fact (Corollary 2 in Solution 33) that any two bases have the same number 
of elements. 

Note that the result answers the sample question about the set {u, v}
described before the statement of the problem: there does indeed exist a basis
of C⁴ containing u and v. One such basis is {u, v, x₁, x₂}.


Solution 44.


If V is a vector space of dimension n, say, and if M is a subspace of V, 
then M is indeed finite-dimensional, and, in fact, the dimension of M must 
be less than or equal to n. If M = O, then the dimension of M is 0, and
the proof is complete. If M contains a non-zero vector x₁, let M₁ (⊂ M) be
the subspace spanned by x₁. If M = M₁, then M has dimension 1, and the
proof is complete. If M ≠ M₁, let x₂ be an element of M not contained in
M₁, and let M₂ be the subspace spanned by x₁ and x₂; and so on. After no
more than n steps the process reaches an end. Reason: the process yields an
independent set, and no such set can have more than n elements (since every
independent set can be extended to a basis, and no basis can have more than
n elements). The only way the process can reach an end is by having the x’s
form a set that spans M, and the proof is complete.


Solution 45. 


A total set is minimal if and only if it is independent. The most natural
way to approach the proofs of the two implications involved seems to be by
contrapositives. That is: E is not minimal if and only if it is dependent.

Suppose, indeed, that E is not minimal, which means that E has a non-empty
subset F such that the relative complement E − F is total. If x is any
vector in F, then there exist vectors x₁, ..., x_n in E − F and there exist scalars
α₁, ..., α_n such that


x = Σ_{j=1}^{n} α_j x_j,


which implies, of course, that the subset {x, x₁, ..., x_n} of E is dependent.

If, in reverse, E is dependent, then there exist vectors x₁, ..., x_n in E
and there exist scalars α₁, ..., α_n not all zero such that


Σ_{j=1}^{n} α_j x_j = 0.


Find i so that α_i ≠ 0, and note that


x_i = −Σ_{j≠i} (α_j/α_i) x_j.


This implies that the set F = E − {x_i} is just as total as E, and hence that E
is not minimal.


Solution 46.


If E is a total subset of a finite-dimensional vector space V, express each 
vector in a basis of V as a linear combination of vectors in E. The vectors 
actually used in all these linear combinations form a finite total subset of E. 
That subset has an independent subsubset with the same span (see Problem 
33), and, therefore, that subsubset is total. Since an independent total set is 
minimal, the reasoning proves the existence of a minimal total subset of E. 

The conclusion remains true for spaces that are not finite-dimensional, 
but at least a part of the technique has to be different. What’s needed, given
E, is an independent subset of E with the same span. A quick way to get one
is to consider the set of all independent subsets of E and to find among them 
a maximal one. (That’s the same technique as is used to prove the existence 
of bases.) The span of such a maximal independent subset of E has to be the 
same as the span of E (for any smaller span would contradict maximality). 
Since the span of E is V, that maximal independent subset is itself total. 
Since an independent total set is a minimal total set (Problem 45), the proof 
is complete: every total set has a minimal total subset.


Solution 47. 


An infinitely total set E always has an infinite subset F such that E — F is 
total. Here is one way to construct an F. 

Consider an arbitrary vector x₁ in E. Since, by assumption, E − {x₁}
is total, there exists a finite subset E₁ of E − {x₁} whose span contains x₁.
Let x₂ be a vector in the relative complement E − ({x₁} ∪ E₁). Since, by
assumption, E − ({x₁, x₂} ∪ E₁) is total, it has a finite subset E₂ whose span
contains x₂. Keep iterating the procedure. That is, at the next step, let x₃ be
a vector in


E − ({x₁, x₂} ∪ E₁ ∪ E₂),


note that that relative complement is total, and that, therefore, it has a finite
subset E₃ whose span contains x₃. The result of this iterative procedure is an
infinite set F = {x₁, x₂, x₃, ...} with the property that E − F is total. Reason:
E_j is a subset of E − F for each j, and therefore x_j belongs to the span of
E − F for each j.


Solution 48.


Assertion: if {x₁, ..., x_k} is a relatively independent subset of Rⁿ, where k ≥
n, then there exists a vector x_{k+1} such that {x₁, ..., x_k, x_{k+1}} is relatively
independent.

For the proof, form all subsets of n − 1 vectors of {x₁, ..., x_k}, and, for
each such subset, form the subspace they span. (Note that the dimension of
each of those subspaces is exactly n − 1, not less. The reason is the assumed
relative independence. This fact is not needed in the proof, but it’s good to
know anyway.) The construction results in a finite number of subspaces that,
between them, certainly do not exhaust Rⁿ; choose x_{k+1} to be any vector
that does not belong to any of them. (The property of the field R that this
argument depends on is that R is infinite.)

Why is the enlarged set relatively independent? To see that, suppose that
y₁, ..., y_{n−1} are any n − 1 distinct vectors of the set {x₁, ..., x_k}. In a
non-trivial dependence relation connecting the y’s and x_{k+1}, that is, in a relation
of the form


Σ_i β_i y_i + αx_{k+1} = 0,


the coefficient α cannot be 0 (for otherwise the y’s would be dependent).
Any such non-trivial dependence would, therefore, imply that x_{k+1} belongs
to the span of the y’s, which contradicts the way that x_{k+1} was chosen. This
completes the proof of the assertion.

Inductive iteration of the assertion (starting with an independent set of
n vectors) yields a relatively independent set {x₁, x₂, x₃, ...} with infinitely
many elements.

A student familiar with cardinal numbers might still be unsatisfied. The 
argument proves, to be sure, that there is no finite upper bound to the possi- 
ble sizes of relatively independent sets, but it doesn’t completely answer the 
original question. Could it be, one can go on to ask, that there exist relatively 
independent sets with uncountably many elements? The answer is yes, but its 
proof seems to demand transfinite techniques (such as Zorn’s lemma). 


Solution 49. 


Let q be the number of elements in the coefficient field F and let n be the 
dimension of the given vector space over F. Since a basis of Fⁿ is a set of
exactly n independent n-tuples of elements of F, the question is (or might as
well be): how many independent sets of exactly n vectors in Fⁿ are there?

Any non-zero n-tuple can be the first element of a basis; pick one, and
call it x₁. Since the number of vectors in Fⁿ is qⁿ, and since only the zero
vector is to be avoided, the number of possible choices at this stage is qⁿ − 1.
Any n-tuple that is not a scalar multiple of x₁ can follow x₁ as the second
element of a basis; pick one and call it x₂. Since the number of vectors in Fⁿ
is qⁿ, and since only the scalar multiples of x₁ are to be avoided, the number
of possible choices at this stage is qⁿ − q. (Note that the number of scalar
multiples of x₁ is the same as the number of scalars, and that is q.) The next
step in this inductive process is typical of the most general step. Any n-tuple
that is not a linear combination of x₁ and x₂ can follow x₁ and x₂ as the
third element of a basis; pick one and call it x₃. Since the number of vectors
in Fⁿ is qⁿ, and since only the linear combinations of x₁ and x₂ are to be
avoided, the number of possible choices at this stage is qⁿ − q². (The number
of linear combinations of two independent vectors is the number of the set
of all pairs of scalars, and that is q².) Keep going the same way a total of n
times altogether; the final answer is the product


(qⁿ − 1)(qⁿ − q)(qⁿ − q²) ··· (qⁿ − q^(n−1))


of the partial answers obtained along the way.

Caution: this product is not the number of bases, but the number of 
ordered bases, the ones in which a basis obtained by permuting the vectors of 
one already at hand is considered different from the original one. (Emphasis: 
the permutations here referred to are not permutations of coordinates in an 
n-tuple, but permutations of the vectors in a basis.) To get the number of 
honest (unordered) bases, divide the answer by n!. 

A curious subtlety arises in this kind of counting. If F = Z_2, and the
formula just derived is applied to F^3 (that is, q = 2 and n = 3), it yields


(8 - 1)(8 - 2)(8 - 4) = 168


ordered bases, and, therefore, 168/3! = 28 unordered ones. Related question:
how many bases for R^3 are there in which each vector (ordered triple of real
numbers) is permitted to have the coordinates 0 and 1 only? A not too laborious
count yields the answer 29. What accounts for the difference? Answer: the set
{(0,1,1), (1,0,1), (1,1,0)} is a basis for R^3, but the same symbols
interpreted modulo 2 describe a subset of F^3 that is not a basis. (Why not?)
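
Both counts can be checked by brute force. The following Python sketch is an illustration only, not part of the original solution; the helper names `ordered_bases` and `det3` are invented here. It evaluates the product formula, then enumerates unordered triples of 0-1 vectors and tests independence by the determinant (non-zero over R, odd over Z_2).

```python
from itertools import combinations, product

def ordered_bases(q, n):
    """Product (q^n - 1)(q^n - q) ... (q^n - q^(n-1)) from the solution."""
    count = 1
    for k in range(n):
        count *= q ** n - q ** k
    return count

def det3(a, b, c):
    """Integer determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

vectors = list(product((0, 1), repeat=3))  # the 8 triples of 0's and 1's

# Unordered bases: sets of 3 distinct vectors, order ignored.
bases_over_Z2 = [t for t in combinations(vectors, 3) if det3(*t) % 2 != 0]
bases_over_R = [t for t in combinations(vectors, 3) if det3(*t) != 0]

print(ordered_bases(2, 3), len(bases_over_Z2), len(bases_over_R))
# The triples that are bases over R but not over Z_2:
print(set(bases_over_R) - set(bases_over_Z2))
```

The last line prints the one exceptional set named in the text; its determinant is 2, which is non-zero in R but 0 modulo 2.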


Solution 50. 


The wording of the question suggests that the direct sum of two finite- 
dimensional vector spaces is finite-dimensional. That is true, and the best 
way to prove it is to use bases of the given vector spaces to construct a basis 
of their direct sum. 

If {x_1, ..., x_n} and {y_1, ..., y_m} are bases for U and V respectively,
then it seems natural to look at the set B of vectors


(x_1, 0), ..., (x_n, 0), (0, y_1), ..., (0, y_m),
and try to prove that it is a basis for U ⊕ V.

The easiest thing to see is that B spans U ⊕ V. Indeed, since every vector
x in U is a linear combination of the x_i’s, it follows that every vector of the
form (x, 0) in U ⊕ V is a linear combination of the (x_i, 0)’s. Similarly, every
vector of the form (0, y) is a linear combination of the (0, y_j)’s, and those
two conclusions together imply that every vector (x, y) in U ⊕ V is a linear
combination of the vectors in B.

Is it possible that the set B is dependent? If


α_1(x_1, 0) + ... + α_n(x_n, 0) + β_1(0, y_1) + ... + β_m(0, y_m) = (0, 0),


then


(Σ_i α_i x_i, Σ_j β_j y_j) = (0, 0),


and it follows from the independence of the x_i’s and of the y_j’s that
α_1 = ... = α_n = β_1 = ... = β_m = 0, and the proof is complete.


Solution 51. 


(a) Let the role of V be played by the vector space P of all real polynomials,
and let M be the subspace of all even polynomials (see Problem 25). When are
two polynomials equal (congruent) modulo M? Answer: when their difference
is even. When, in particular, is a polynomial equal to 0 modulo M? Answer:
when it is even. Consequence: if p_n(x) = x^{2n+1}, for n = 0, 1, 2, ..., then
a non-trivial linear combination of a finite set of these p_n’s can never be 0
modulo M. Reason: in any such linear combination, let k be the largest
index for which the coefficient of p_k is not 0, and note that in that case
the degree of the linear combination will be 2k + 1 (which is not even).
Conclusion: the quotient space V/M has an infinite independent subset, which
implies, of course, that it is not finite-dimensional.

(b) If, on the other hand, N is the subspace of all polynomials p for
which p(0) = 0 (the constant term is 0), then the equality of two polynomials
modulo N simply means that they have the same constant term. Consequence:
every polynomial is congruent modulo N to a scalar multiple of the constant
polynomial 1, which implies that the dimension of V/N is 1. If bigger
examples are wanted, just make N smaller. To be specific, let N be the set of
all those polynomials in which not only the constant term is required to be
0, but the coefficients of the powers x, x^2, and x^3 are required to be 0 also.
Consequence: every polynomial is congruent modulo N to a polynomial of
degree 3 at most, which implies that the dimension of V/N is 4.


Solution 52.


If M is an m-dimensional subspace of an n-dimensional vector space V,
then V/M has dimension n - m. Only one small idea is needed to begin
the proof; after that everything becomes mechanical. The assumption that
dim V = n means that a basis of V has n elements; the small idea is to
use a special kind of basis, the kind that begins as a basis of M. To say that
more precisely, let {x_1, ..., x_m} be a basis for M, and extend it, by adjoining
suitable vectors x_{m+1}, ..., x_n, so as to make it a basis of V. From now on
no more thinking is necessary; the natural thing to try to do is to prove that
the cosets


x_{m+1} + M, ..., x_n + M


form a basis for V/M.

Do they span V/M? That is: if x ∈ V, is the coset x + M necessarily a
linear combination of them? The answer is yes, and the reason is that x is a
linear combination of x_1, ..., x_n, so that


x = Σ_{i=1}^{n} α_i x_i

for suitable coefficients. Since


Σ_{j=1}^{m} α_j x_j


is congruent to 0 modulo M, it follows that


x + M = Σ_{i=m+1}^{n} α_i (x_i + M),


and that’s exactly what’s wanted.

Are the cosets x_{m+1} + M, ..., x_n + M independent? Yes, and the reason
is that the vectors x_{m+1}, ..., x_n are independent modulo M. Indeed, if a
linear combination of these vectors turned out to be equal to a vector, say
z, in M, then z would be a linear combination of x_1, ..., x_m, and the only
possible linear combination it could be is the trivial one (because the totality
of the x’s is independent).

The proof is complete, and it proved more than was promised: it con- 
cretely exhibited a basis of n — m elements for V/M . 


Solution 53. 


The answer is easy to guess, easy to understand, and easy to prove, but it
is such a frequently occurring part of mathematics that it’s well worth a
few extra minutes of attention. The reason it is easy to guess is that span and
dimension behave (sometimes, partially) the same way as union and counting.
The number of elements in the union of two finite sets is not the sum of their 
separate numbers—not unless the sets are disjoint. If they are not disjoint, then 
adding the numbers counts twice each element that belongs to both sets—the 
sum of the numbers of the separate sets is the number of elements in the 
union plus the number of elements in the intersection. The same sort of thing 
is true for spans and dimensions; the correct version of the formula in that 
case is 


dim(M + N) + dim(M ∩ N) = dim M + dim N.


The result is sometimes known as the modular equation.
To prove it, write dim(M ∩ N) = k, and choose a basis


{z_1, ..., z_k}
for M ∩ N. Since a basis for a subspace can always be extended to a basis
for any larger space, there exist vectors x_1, ..., x_m such that the set
{x_1, ..., x_m, z_1, ..., z_k}


is a basis for M; in this notation
dim M = m + k.
Similarly, there exist vectors y_1, ..., y_n such that the set
{y_1, ..., y_n, z_1, ..., z_k}
is a basis for N; in this notation
dim N = n + k.


The span of the x’s is disjoint from N, in the sense that the two have only the
zero vector in common (for otherwise the x’s and z’s together couldn’t be
independent), and, similarly, the span of the y’s is disjoint from
M. It follows that the set


{x_1, ..., x_m, y_1, ..., y_n, z_1, ..., z_k}
is a basis for M + N. The desired equation, therefore, takes the form
(m + n + k) + k = (m + k) + (n + k),


which is obviously true. (Note how the intersection M ∩ N is “counted twice”
on both sides of the equation.)
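
A small numerical check of the modular equation may be reassuring. The Python sketch below is an illustration only: it works over the two-element field Z_2, where a subspace of dimension d has exactly 2^d vectors, so dimensions can be read off by counting. The particular subspaces M and N of (Z_2)^4 and the helper names `span` and `dim` are assumptions chosen for the example.

```python
from itertools import product

def span(generators):
    """Set of all linear combinations over Z_2 of the given 0-1 tuples."""
    n = len(generators[0])
    vectors = set()
    for coeffs in product((0, 1), repeat=len(generators)):
        vectors.add(tuple(
            sum(c * g[i] for c, g in zip(coeffs, generators)) % 2
            for i in range(n)
        ))
    return vectors

def dim(subspace):
    """A subspace of dimension d over Z_2 has exactly 2^d elements."""
    return len(subspace).bit_length() - 1

M = span([(1, 0, 0, 0), (0, 1, 0, 0)])
N = span([(0, 1, 0, 0), (0, 0, 1, 1)])

# M + N is the set of all sums u + v; the intersection is a plain set intersection.
M_plus_N = {tuple((a + b) % 2 for a, b in zip(u, v)) for u in M for v in N}
M_cap_N = M & N

print(dim(M_plus_N), dim(M_cap_N), dim(M), dim(N))
```

Here the intersection is spanned by (0, 1, 0, 0), and the four dimensions come out as 3, 1, 2, 2, so that 3 + 1 = 2 + 2.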

That’s all there is to it, but the proof has a notational blemish that is
frequent in mathematical exposition. It is quite possible that some of the
dimensions under consideration in the proof are 0; the case


dim(M ∩ N) = 0,


for instance, is of special interest. In that special case the notation is
inappropriate: the suffix on z_1 suggests that M ∩ N has a non-empty basis, which
is false. It is not difficult to cook up a defensible notational system in such
situations, but usually it’s not worth the trouble; it is easier (and no less
rigorous) just to remember that in case something is 0 a part of the argument
goes away.


Solution 54.

(a) The definitions (1) and (3) yield linear transformations; the definition (2) 
does not. The verification of linearity in (1) is boring but easy; just replace 
(x,y) by an arbitrary linear combination 


α_1(ξ_1, η_1) + α_2(ξ_2, η_2),


apply T, and compare the result with the result of doing things in the other 
order. Here it is, for the record. Do NOT read it till after trying to write it 
down independently, and, preferably, do not ever read it. 

First: 


T(α_1(ξ_1, η_1) + α_2(ξ_2, η_2))
= T(α_1 ξ_1 + α_2 ξ_2, α_1 η_1 + α_2 η_2)
= (α(α_1 ξ_1 + α_2 ξ_2) + β(α_1 η_1 + α_2 η_2), γ(α_1 ξ_1 + α_2 ξ_2) + δ(α_1 η_1 + α_2 η_2)).


Second:


α_1 T(ξ_1, η_1) + α_2 T(ξ_2, η_2)
= α_1(αξ_1 + βη_1, γξ_1 + δη_1) + α_2(αξ_2 + βη_2, γξ_2 + δη_2).


Third and last: compare the last lines of these two computations.
As for (2): its linearity was already destroyed by the squaring counterex- 
ample in the discussion before the statement of the problem. Check it. 
The example (3) is the same as (1); the only difference is in the names
of the fixed scalars.

(b) As before, the definitions (1) and (3) yield linear transformations and 
the definition (2) does not. To discuss (1), look at any typical polynomial, 
such as, say 


9x^3 - 3x^2 + 2x - 5,
and do what (1) says to do, namely, replace x by x^2. The result is
9x^6 - 3x^4 + 2x^2 - 5.


Then think of doing this to two polynomials, that is, to two elements of P, 
and forming the sum of the results. Is the outcome the same as if the addition 
had been performed first and only then was x replaced by x^2? Do this quite
generally: think of two arbitrary polynomials, think of adding them and then
replacing x by x^2, and compare the result with what would have happened
if you had replaced x by x^2 first and added afterward. It’s not difficult to
design suitable notation to write this down in complete generality, but thinking
about it without notation is more enlightening, and the answer is yes. Yes,
the results are the same. That’s a statement about addition, which is a rather 
special linear combination, but the scalars that enter into linear combinations 
have no effect on the good outcome. 

The definition (2) is the bad kind of squaring once more. Counterexample:
consider the polynomial (vector) p(x) = x and the scalar 2, and compare
T(2p(x)) with 2Tp(x). The first is (2p(x))^2, which is 4x^2, and the second
is 2x^2. Question: what happens if p(x) is replaced by the even simpler
polynomial p(x) = 1? Is that a counterexample also?

The discussion of (3) can be carried out pretty much the same way as the
discussion of (1): instead of talking about linear combinations and replacing
x by x^2, talk about linear combinations and multiply them by x^2. It doesn’t
make any difference which is done first; the formula (3) does indeed define
a linear transformation.
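
The three definitions in part (b) can be tried out mechanically. The Python sketch below is an illustration only and makes its own assumptions: a polynomial is stored as a non-empty list of coefficients (entry k multiplies x^k), and the function names are invented for the example. It checks additivity and homogeneity on sample inputs for (1) and (3), and reproduces the counterexample for (2).

```python
def add(p, q):
    """Sum of two polynomials given as coefficient lists."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def scale(c, p):
    """Scalar multiple c * p."""
    return [c * a for a in p]

def substitute_x_squared(p):
    """Definition (1): replace x by x^2."""
    q = [0] * (2 * len(p) - 1)
    for k, c in enumerate(p):
        q[2 * k] = c
    return q

def square(p):
    """Definition (2): multiply the polynomial by itself."""
    q = [0] * (2 * len(p) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(p):
            q[i + j] += a * b
    return q

def times_x_squared(p):
    """Definition (3): multiply by x^2."""
    return [0, 0] + list(p)

p = [-5, 2, -3, 9]   # 9x^3 - 3x^2 + 2x - 5
q = [1, 1]           # x + 1
x = [0, 1]           # the polynomial x

print(substitute_x_squared(p))                 # coefficients of 9x^6 - 3x^4 + 2x^2 - 5
print(square(scale(2, x)), scale(2, square(x)))  # 4x^2 versus 2x^2
```

A check on a few sample polynomials proves nothing in general, of course; it only illustrates the behaviour that the argument above establishes.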


Solution 55. 


(1) If F is a linear functional defined on a vector space V, then either F(v)
is 0 for every vector v in V, or it is not. (The possibility is a realistic one:
the equation F(v) = 0 does indeed define a linear functional on every vector
space.) If F(v) = 0 for all v, then ran F just consists of the vector 0 (in
R^1), and nothing else has to be said. If that is not the case, then the range
of F contains some vector x_0 in R^1 (a real number) different from 0. To
say that x_0 is in the range means that V contains some vector v_0 such that
F(v_0) = x_0. Since F is a linear functional (linear transformation), it follows
in particular that


F(x v_0) = x x_0


for every real number x. As x ranges over all real numbers, so does the
product x x_0. Conclusion: the range of F is all of R^1.

(2) The replacement of x by x +2 is a change of variables similar to (but 
simpler than) the replacement of x by x^2 considered in Problem 54 (1 (b)),
and the proof that it is a linear transformation is similar to (but simpler than) 
what it was there. Squaring the variable can cause trouble because it usually 
raises the degree of the polynomial to which it is done (usually?—does it ever 
not do so?); the present simple change of variables does not encounter even 
that difficulty. 

(3) The range of this transformation contains only one vector, namely 
(0, 0); it is indeed a linear transformation. 

(4) The equation does not define a linear transformation. Counterexam- 
ples are not only easy to find—they are hard to miss. For a special one, 
consider the vector (0,0,0) and the scalar 2. Is it true that 


T(2 · (0,0,0)) = 2 · T(0,0,0)?


The left side of the equation is equal to T(0,0,0), which is (2, 2); the right
side, on the other hand, is equal to 2 · (2, 2), which is (4, 4).

(5) The “weird” vector space, call it W for the time being, is really the
easy vector space R^1 in disguise; they differ in notation only. That statement
is worth examining in detail.

Suppose that two people, call them P and Q, play a notation game. Player
P is thinking of the vector space R^1, but as he plays the game he never says
anything about the vectors that are in his thoughts; he writes everything. His
first notational whimsy is to enclose every vectorial symbol in a box (shown
here by square brackets); instead of writing a vector x (in the present case a
real number), he writes [x], and instead of writing something like 2 + 3 = 5
or something like 2 · 3 = 6, he writes


[2] [+] [3] = [5] and 2 [·] [3] = [6].


(Note: “2” in the last equation is a scalar, not a vector; that’s why its symbol
is not, should not be, in a box.) Player Q wouldn’t be seriously mystified by
such a thin disguise.

Suppose next that the notational change is a stranger one: the operational
symbols + and · continue to appear in boxes, but the symbols for vectors
appear as exponents with the base 2. (Caution: vectors, not scalars.) In that
case every time P thinks of a vector x what he writes is the number s obtained
by using x as an exponent on the base 2. Example: P thinks 1 and writes
2; P thinks 0 and writes 1; P thinks 2 and writes 4; P thinks 1/2 and writes
√2; P thinks -3 and writes 1/8. What will Q ever see? Since s is positive no
matter what x is (that is, 2^x is positive no matter what real number x is), all
the numbers that Q will ever see are positive. As x ranges over all possible
real numbers, the exponential s (that is, 2^x) ranges over all possible positive
real numbers. When P adds two real numbers (vectors), x and y say, what
he reports to Q is s [+] t, where s = 2^x and t = 2^y. Example: when P adds
1 and 2 and gets 3, the report that Q sees is 2 [+] 4 = 8. As far as Q is
concerned the numbers he is looking at were multiplied.

Scalar multiplication causes a slight additional notational headache. Both
P and Q are thinking about a real vector space, which means that both are
thinking about vectors, but P’s vectors are numbers in R^1 and Q’s vectors
are numbers in R_+. Scalars, however, are the same for both, just plain real
numbers. When P thinks of multiplying a real number x (a vector) by a real
number y (a scalar), the traditional symbol for what he gets is yx, but what
he writes is


y [·] s = t,


where s = 2^x and t = 2^{yx}. Notice that 2^{yx} = (2^x)^y, or, in other words,
t = s^y. Example: when P is thinking (in traditional notation) about 3 · 2 = 6,
what Q sees is 3 [·] 4 = 64, which he interprets to mean that the scalar
multiple of 4 by 3 has to be obtained by raising 4 to the power 3.

That’s it: the argument shows (doesn’t it?) that R^1 and R_+ differ in
notation only. Yes, R_+ is indeed a vector space. If T is defined on R_+ by
T(s) = log_2 s (note: log to the base 2), then T in effect decodes the notation
that P encoded. When Q applies T to a vector s in R_+, and gets log_2 s, he
recaptures the notation that P disguised. Thus, in particular, when s = 2^x and
t = 2^y and T is applied to s [+] t, the result is log_2(2^x · 2^y), which is x + y.
In other words,


T(s [+] t) = Ts + Tt,


which is a part of the definition of a linear transformation. The other part, the
one about scalar multiples, goes the same way: if s = 2^x and t = 2^{yx}, then


T(y [·] s) = T(t) = yx = yT(s).


There is nothing especially magical about log_2; logarithms to other bases
could have been used just as well. Just remember that log_10 s, for instance,
is just a constant multiple of log_2 s; in fact


log_10 s = (log_10 2) · log_2 s
for every positive real number s. If T had been defined by


T(s) = log_10 s,


the result would have been the same; the constant factor log_10 2 just goes
along for the ride.
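
The notation game of part (5) is easy to act out. In the Python sketch below (an illustration only; the names `box_add`, `box_scale`, and `T` are invented for it) the “addition” that Q sees is ordinary multiplication of positive numbers, the “scalar multiplication” is exponentiation, and T is the logarithm to the base 2.

```python
import math

def box_add(s, t):
    """Vector addition in R_+ as Q sees it: ordinary multiplication."""
    return s * t

def box_scale(y, s):
    """Scalar multiple of the vector s by the scalar y: s raised to the power y."""
    return s ** y

def T(s):
    """The decoding transformation: logarithm to the base 2."""
    return math.log2(s)

# P thinks 1 + 2 = 3; Q sees 2 [+] 4 = 8.
print(box_add(2, 4), T(box_add(2, 4)))
# P thinks 3 * 2 = 6; Q sees 3 [.] 4 = 64.
print(box_scale(3, 4), T(box_scale(3, 4)))

# Linearity on arbitrary positive inputs, up to floating-point rounding.
s, t, y = 5.0, 0.3, -1.7
print(math.isclose(T(box_add(s, t)), T(s) + T(t)))
print(math.isclose(T(box_scale(y, s)), y * T(s)))
```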


Solution 56. 


(1) What do you know about a function if you know that its indefinite in- 
tegral is identically 0? Answer: the function must have been 0 to start with. 
Conclusion: the kernel of the integration transformation is {0}. 

(2) What do you know about a function if you know that its derivative 
is identically 0? Answer: the function must be a constant. Conclusion: ker D 
is the set of all constant polynomials. 

(3) How can it happen that


2x + 3y = 0
and
7x - 5y = 0?
To find out, eliminate x. Since
7 · 2x + 7 · 3y = 0
and
2 · 7x - 2 · 5y = 0,
therefore
21y + 10y = 0,
or y = 0, and from that, in turn, it follows that
2x + 3y = 2x + 3 · 0 = 2x = 0,


and hence that x = 0. Conclusion: ker T = {(0, 0)}.
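
The elimination in (3) can be double-checked with exact rational arithmetic. The Python sketch below is an illustration only; it assumes, as the two equations suggest, that T(x, y) = (2x + 3y, 7x - 5y), and the helper `solve` (Cramer's rule for this one system) is invented for the example.

```python
from fractions import Fraction

def T(x, y):
    """The transformation whose kernel is sought (assumed form)."""
    return (2 * x + 3 * y, 7 * x - 5 * y)

def solve(r, s):
    """Solve 2x + 3y = r, 7x - 5y = s by Cramer's rule."""
    det = 2 * (-5) - 3 * 7            # equals -31, which is not 0
    x = Fraction(r * (-5) - 3 * s, det)
    y = Fraction(2 * s - 7 * r, det)
    return x, y

# The only solution of the homogeneous system is (0, 0).
print(solve(0, 0))

# The elimination step: 7 times the first expression minus 2 times the second
# leaves 31y and nothing else.
print(all(7 * (2 * x + 3 * y) - 2 * (7 * x - 5 * y) == 31 * y
          for x in range(-3, 4) for y in range(-3, 4)))
```

Because the determinant is not 0, every right-hand side has exactly one solution, and the right-hand side (0, 0) has the solution (0, 0).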
(4) How can it happen for a polynomial p that p(x^2) = 0? Recall, for
instance, that if


p(x) = 9x^3 - 3x^2 + 2x - 5,
then


p(x^2) = 9x^6 - 3x^4 + 2x^2 - 5;


the only way that can be 0 is by having all its coefficients equal to 0, which
happens only when all the coefficients of p were 0 to begin with. (See Problem
54 (2 (a)).) Conclusion: the kernel of this change of variables is {0}.

(5) To say that T(x, y) = (0, 0) is the same as saying that (x, 0) = (0, 0),
and that is the same as saying that x = 0. In other words, if (x, y) is in the
kernel of T, then (x, y) = (0, y). Conclusion: ker T is the y-axis.

(6) This is an old friend. The question is this: for which vectors (x, y) 
in R^2 is it true that x + 2y = 0? Answer: the ones for which it is true, and
nothing much more intelligent can be said about them, except that the set was 
encountered before and given the name Rẹ. (See Problem 22.) 


Solution 57. 


(1) The answer is yes: the stretching transformation, which is just scalar mul- 
tiplication by 7, commutes with every linear transformation. The computation 
is simple: if v is an arbitrary vector, then 


(ST)v = S(Tv) by the definition of composition 
= 7(Tv) by the definition of S 
and 
(TS)v = T(Sv) by the definition of composition 
=T(7v) by the definition of S 
= 7(Tv) by the linearity of T. 


The number 7 has, of course, nothing to do with all this: the same conclusion
is true for every scalar transformation. (For every scalar γ the linear
transformation S defined for every vector v by Sv = γv is itself called a
scalar. Words are stretched by this usage, but in a harmless way, and breath
is saved.) The proof is often compressed into one line (slightly artificially) as
follows:


(ST)v = S(Tv) = γ(Tv) = T(γv) = T(Sv) = (TS)v.


(2) The question doesn’t make sense; S: R^3 → R^3 (that is, S is a
transformation from R^3 to R^3) and T: R^3 → R^2, so that TS can be formed, but
ST cannot.

(3) If p(x) = x, then STp(x) (a logical fussbudget would write
((ST)p)(x), but the fuss doesn’t really accomplish anything) is STx =
S(x^2 · x) = Sx^3 = x^6, whereas TSx = Tx^2 = x^2 · x^2 = x^4, and that’s
enough to prove that S and T do not commute.

A student inexperienced with thinking about the minimal, barebones,
extreme cases that are usually considered mathematically the most elegant
might prefer to examine a more complicated polynomial (not just x, but, say,
1 + 2x + 3x^2). For the brave student, however, there is an even more extreme
case to look at (more extreme than x): the polynomial p(x) = 1. The action of
T on 1 is obvious: T1 = x^2. What is the action of S on 1? Answer: the result
of replacing the variable x by x^2 throughout, and since x does not explicitly
appear in 1, the consequence is that S1 = 1. Consequence: ST1 = Sx^2 = x^4
and TS1 = T1 = x^2. Conclusion: (as before) S and T do not commute.

To say that S and T do not commute means, of course, that the com- 
positions ST and TS are not the same linear transformation, and that, in 
turn, means that they disagree at at least one vector. It might happen that 
they agree at many vectors, but just one disagreement ruins commutativity. 
Do the present ST and TS agree anywhere? Sure: they agree at the vector 
0. Anywhere else? That’s a nice question, and it’s worth a moment’s thought 
here. Do ST and TS agree at any polynomial other than 0? Since 


STp(x) = S(x^2 p(x)) = x^4 p(x^2)
and
TSp(x) = Tp(x^2) = x^2 p(x^2),


the question reduces to this: if p ≠ 0, can x^4 p(x^2) and x^2 p(x^2) ever be the
same polynomial? The answer is obviously no: if that equation held for p ≠ 0,
it would follow that x^2 = x^4, which is ridiculous. (Careful: x^2 = x^4 is not
an equation to be solved for an unknown x. It offers itself as an equation, an
identity, between two polynomials, and that’s what’s ridiculous.)
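
The computations in (3) can be replayed mechanically. In the Python sketch below (an illustration only) a polynomial is represented, by assumption, as a non-empty list of coefficients whose entry k multiplies x^k; S replaces x by x^2 and T multiplies by x^2, as in the formulas above.

```python
def S(p):
    """Replace x by x^2: the coefficient of x^k moves to x^(2k)."""
    q = [0] * (2 * len(p) - 1)
    for k, c in enumerate(p):
        q[2 * k] = c
    return q

def T(p):
    """Multiply by x^2: shift every coefficient up by two places."""
    return [0, 0] + list(p)

one = [1]      # the polynomial 1
x = [0, 1]     # the polynomial x

print(S(T(one)), T(S(one)))   # x^4 versus x^2
print(S(T(x)), T(S(x)))       # x^6 versus x^4
```

In each pair the two coefficient lists differ, which is the failure of commutativity in concrete form.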

(4) Since S: R^2 → R^1 and T: R^1 → R^2, both products ST and TS make
sense, and, in fact


ST: R^1 → R^1 and TS: R^2 → R^2.


It may be fun to calculate what ST and TS are, but for present purposes it
is totally unnecessary. The point is that ST is a linear transformation on R^1
and TS is a linear transformation on R^2; the two have different domains and
it doesn’t make sense to ask whether they are equal. No, that’s not correctly
said: it makes sense to ask, but the answer is simply no.

(5) To decide whether STp(x) = TSp(x) for all p, look at a special 
case, and, in particular, look at an extreme one such as p(x) = 1, and hope 
that it solves the problem. Since S1 = 1 and T1 = 1, it follows that ST1 = 
TS1 = 1. Too bad—that doesn’t settle anything. What about p(x) = x? 
Since STx = SO = 0 and TSx = T(x + 2) = 2, that does settle something: 
S and T do not commute. 

(6)-(1) The scalar 7 doesn’t affect domain, range, or kernel: the question 
is simply about dom T, ran T, and ker T. Answer: 


dom T = ran T = R^2, and ker T = {0}.

(6)-(2) Since TS(x, y, z) = T(7x, 7y, 7z) = (7x, 7y), it follows easily
that dom TS = R^3, ran TS = R^2, and ker TS is the set of all those vectors
(x, y, z) in R^3 for which x = y = 0, that is, the z-axis. (Look at the whole
question geometrically.) Since there is no such thing as ST, the part of the
question referring to it doesn’t make sense.

(6)-(3) The domains are easy: dom ST = dom TS = P. The kernels are 
easy too: since 


STp(x) = S(x^2 p(x)) = x^4 p(x^2)
and

TSp(x) = Tp(x^2) = x^2 p(x^2),
it follows that ker ST = ker TS = {0}. The question about ranges takes
a minute of thought. It amounts to this: which polynomials are of the form
x^4 p(x^2), and which are of the form x^2 p(x^2)? Answer: ran TS is the set of
all even polynomials with 0 constant term, and ran ST is the set of all those
even polynomials in which, in addition, the coefficient of x^2 is 0 also.

(6)-(4) Now is the time to calculate the products: 

STx = S(x, x) = x + 2x = 3x

and
TS(x, y) = T(x + 2y) = (x + 2y, x + 2y).


Answers: dom ST = R^1, dom TS = R^2, ran ST = R^1, ran TS is the
“diagonal” consisting of all vectors (x, y) in R^2 with x = y, ker ST = {0},
and ker TS is the line with the equation x + 2y = 0.

(6)-(5) To find the answer to (5) the only calculations needed were for
STx and TSx. To get more detailed information, more has to be calculated,
as follows:


ST(α + βx + γx^2 + δx^3) = S(α + γx^2)
= α + γ(x + 2)^2 = (α + 4γ) + 4γx + γx^2
and
TS(α + βx + γx^2 + δx^3) = T(α + β(x + 2) + γ(x + 2)^2 + δ(x + 2)^3)
= (α + 2β + 4γ + 8δ) + (γ + 6δ)x^2.


There is no trouble with domains (both are P_3). The range of ST is the set
of all those quadratic polynomials for which the coefficient of x is four times
the coefficient of x^2, and the range of TS is the set of all those quadratic
polynomials for which the coefficient of x is 0. The kernel of ST is the set of
all those cubic polynomials, that is polynomials of the form


α + βx + γx^2 + δx^3,


for which α = γ = 0, and the kernel of TS is the set of all those whose
coefficients satisfy the more complicated equations


α + 2β + 4γ + 8δ = γ + 6δ = 0.
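
The two expansions in (6)-(5) are easy to get wrong by hand, so a mechanical check is worthwhile. The Python sketch below is an illustration only and rests on assumptions read off from the formulas above: S replaces x by x + 2, T keeps the constant term and the x^2 term of a cubic and discards the rest, and a polynomial in P_3 is stored as a list of four coefficients (entry k multiplies x^k).

```python
from math import comb

def S(p):
    """Replace x by x + 2, using (x + 2)^k = sum over i of C(k, i) 2^(k-i) x^i."""
    q = [0] * len(p)
    for k, c in enumerate(p):
        for i in range(k + 1):
            q[i] += c * comb(k, i) * 2 ** (k - i)
    return q

def T(p):
    """Keep the constant term and the x^2 term only (assumed definition)."""
    return [p[0], 0, p[2], 0]

a, b, c, d = 1, 2, 3, 4          # sample coefficients alpha, beta, gamma, delta
p = [a, b, c, d]

print(S(T(p)))   # expected shape: (a + 4c) + 4c x + c x^2
print(T(S(p)))   # expected shape: (a + 2b + 4c + 8d) + (c + 6d) x^2
```

With these sample values the first list is [13, 12, 3, 0] and the second is [49, 0, 27, 0], in agreement with the displayed formulas.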


Solution 58. 


Yes, ran A ⊂ ran B implies the existence of a linear transformation T such
that A = BT. The corresponding necessary condition for right divisibility,
A = SB, is


ker B ⊂ ker A,


and it too is sufficient.

The problem is, given a vector x in the vector space V, to define Tx,
and, moreover, to do it so that Ax turns out to be equal to BTx. Put y = Ax,
so that y ∈ ran A; the assumed condition then implies that y ∈ ran B. That
means that y = Bz for some z, and the temptation is to define Tx to be z.
That might not work. The difficulty is one of ambiguity: z is not uniquely
determined by y. It could well happen that y is equal to both Bz_1 and Bz_2;
should Tx be z_1 or z_2?

If Bz_1 = Bz_2, then B(z_1 - z_2) = 0, which says that


z_1 - z_2 ∈ ker B.


The way to avoid the difficulty is to stay far away from ker B, and the way 
to do that is to concentrate, at least temporarily, on a complement of ker B. 
Very well: let M be such a complement, so that 


M ∩ ker B = {0} and M + ker B = V.


Since B maps ker B to {0}, the image of M under B is equal to the 
entire range of B, and since M has only 0 in common with ker B, the map- 
ping B restricted to M is one-to-one. It follows that for each vector x there 
exists a vector z in M such that Ax = Bz, and, moreover, there is only one 
such z; it is now safe to yield to temptation and define Tx to be z. The 
conceptual difficulties are over; the rest consists of a routine verification that 
the transformation T so defined is indeed linear (and, even more trivially, that 
A= BT). 

As for right divisibility, A = SB, the implication from there to ker B ⊂
ker A is obvious; all that remains is to prove the converse. A little
experimentation with the ideas of the preceding proof will reveal that the right
thing to consider this time is a complement N of ran B. For any vector x in ran B,
that is, for any vector of the form By, define Sx to be Ay. Does that make
sense? Couldn’t it happen that one and the same x is equal to both By_1 and
By_2, so that Sx is defined ambiguously to be either Ay_1 or Ay_2? Yes, it
could, but no ambiguity would result. The reason is that if By_1 = By_2, so
that y_1 - y_2 ∈ ker B, then the assumed condition implies that y_1 - y_2 ∈ ker A,
and hence that Ay_1 = Ay_2. Once S is defined on ran B, it is easy to extend
it to all of V just by setting it equal to 0 on N. The rest consists of a routine
verification that the transformation S so defined is indeed linear (and, even
more trivially, that A = SB).


Solution 59. 


The questions have interesting and useful answers in the finite-dimensional 
case; it is, therefore, safe and wise to assume that the underlying vector space 
is finite-dimensional. 

(1) If the result of applying a linear transformation A to each vector 
in a total set is known, then the entire linear transformation is known. It is 
instructive to examine that statement in a simple special case; suppose that 
the underlying vector space is R?. If 


A(1, 0) = (α, γ)
and


A(0, 1) = (β, δ)


(there is a reason for writing the letters in a slightly non-alphabetic order
here), then


A(x, y) = x(α, γ) + y(β, δ) = (αx + βy, γx + δy)


(and the alphabet has straightened itself out). 

The reasoning works backwards too. Given A, corresponding scalars α,
β, γ, δ can be found (uniquely); given scalars α, β, γ, δ, a corresponding
linear transformation A can be found (uniquely).
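
The two-dimensional case can be written out as a few lines of code. The Python sketch below is an illustration only; the function name `make_transformation` and the sample images (2, 7) and (3, -5) are assumptions chosen for the example. It builds A from the images of (1, 0) and (0, 1), exactly as in the formula A(x, y) = (αx + βy, γx + δy).

```python
def make_transformation(image_of_e1, image_of_e2):
    """Return the linear transformation on R^2 with the prescribed images
    of the basis vectors (1, 0) and (0, 1)."""
    alpha, gamma = image_of_e1
    beta, delta = image_of_e2

    def A(x, y):
        # A(x, y) = x * A(1, 0) + y * A(0, 1)
        return (alpha * x + beta * y, gamma * x + delta * y)

    return A

A = make_transformation((2, 7), (3, -5))
print(A(1, 0), A(0, 1))   # the prescribed images come back
print(A(1, 1))            # the sum of the two images
```

Going backwards is just as short: the four scalars are recovered by evaluating A at (1, 0) and at (0, 1).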

The space R^2 plays no special role in this examination; every 2-dimensional
space behaves the same way. And the number 2 plays no special role
here; any finite-dimensional space behaves the same way. The only difference 
between the low and the high dimensions is that in the latter more indices 
(and therefore more summations) have to be juggled. Here is how the juggling 
looks. 

Given: a linear transformation A on a vector space V with a prescribed
total set, and an arbitrary vector x in V. Procedure: express x as a linear
combination of the vectors in the total set, and deduce that the result of
applying A to x is the same linear combination of the results of applying A to
the vectors of the total set. If, in particular, V is finite-dimensional, with basis
{e_1, e_2, ..., e_n}, then a linear transformation A is uniquely determined by
specifying Ae_j for each j. The image Ae_j is, of course, a linear combination
of the e_i’s, and, of course, the coefficient of e_i in its expansion depends
on both i and j. Consequence: Ae_j has the form Σ_i α_{ij} e_i. In reverse:
given an array of scalars α_{ij} (i = 1, ..., n; j = 1, ..., n), a unique linear
transformation A is defined by specifying that


Ae_j = Σ_{i=1}^{n} α_{ij} e_i


for each j. Indeed, if


x = Σ_{j=1}^{n} γ_j e_j,


then


Ax = Σ_{j=1}^{n} γ_j Ae_j = Σ_{j=1}^{n} γ_j Σ_{i=1}^{n} α_{ij} e_i = Σ_{i=1}^{n} (Σ_{j=1}^{n} α_{ij} γ_j) e_i.

The conclusion is that there is a natural one-to-one correspondence between
linear transformations A on a vector space of dimension n and square
arrays (matrices) {α_{ij}} (i = 1, ..., n; j = 1, ..., n). Important comment:
linear combinations of linear transformations correspond to the same linear
combinations of arrays. If, that is,


Ae_j = Σ_{i=1}^{n} α_{ij} e_i and Be_j = Σ_{i=1}^{n} β_{ij} e_i,


then


(αA + βB)e_j = Σ_{i=1}^{n} (α α_{ij} + β β_{ij}) e_i.


Each {α_{ij}} has n^2 entries; except for the double subscripts (which are hardly
more than a matter of handwriting) the α_{ij}’s are the coordinates of a vector in
R^{n^2}. Conclusion: the vector space L(V) is finite-dimensional; its dimension
is n^2.

(2) Consider the linear transformations 1, A, A^2, ..., A^{n^2}. They constitute
n^2 + 1 elements of the vector space L(V) of dimension n^2, and, consequently,
they must be linearly dependent. The assertion of linear dependence is the
assertion of the existence of scalars α_0, α_1, ..., α_{n^2}, not all 0, such that

α_0 + α_1 A + ... + α_{n^2} A^{n^2} = 0,
and that, in turn, is the assertion of the existence of a polynomial


p(t) = α_0 + α_1 t + ... + α_{n^2} t^{n^2},


not identically 0, such that p(A) = 0. Conclusion: yes, there always exists a
non-zero polynomial p such that p(A) = 0.

(3) If A is defined by Ax = y_0(x) x_0, then

A^2 x = A[Ax] = y_0(x) Ax_0 = y_0(x)[y_0(x_0) x_0] = y_0(x_0) Ax.
In other words: A^2 x is a scalar multiple (by the scalar y_0(x_0)) of Ax, or,
simpler said, A^2 is a scalar multiple (by the scalar y_0(x_0)) of A. Differently
expressed, the conclusion is that if p is the polynomial (of degree 2) defined
by
p(t) = t^2 - y_0(x_0) t,

then p(A) = 0; the answer to the question is 2.
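
The identity A^2 = y_0(x_0) A in part (3) can be watched at work on a small example. The Python sketch below is an illustration only; the vector x0 = (1, 2, 3) in R^3 and the linear functional y0(x) = 4x_1 + x_2 - x_3 are assumptions chosen for the example.

```python
x0 = (1, 2, 3)
y0_coeffs = (4, 1, -1)

def y0(x):
    """The linear functional y0(x) = 4*x1 + x2 - x3."""
    return sum(c * xi for c, xi in zip(y0_coeffs, x))

def A(x):
    """The transformation Ax = y0(x) * x0."""
    s = y0(x)
    return tuple(s * xi for xi in x0)

scalar = y0(x0)   # 4 + 2 - 3 = 3
for x in [(1, 0, 0), (0, 1, 0), (2, -1, 5)]:
    # p(A)x = A(Ax) - y0(x0) * Ax should be the zero vector.
    residual = tuple(u - scalar * v for u, v in zip(A(A(x)), A(x)))
    print(x, A(x), residual)
```

Each residual printed is (0, 0, 0), as p(A) = 0 requires.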


Solution 60. 


Suppose that T is a linear transformation with inverse T^{-1} on a vector space
V. If v_1 and v_2 are in V with Tu_1 = v_1 and Tu_2 = v_2, then


T^{-1}(v_1 + v_2) = T^{-1}(Tu_1 + Tu_2)
= T^{-1}(T(u_1 + u_2)) (because T is linear)
= u_1 + u_2 (by the definition of T^{-1})
= T^{-1}v_1 + T^{-1}v_2 (by the definition of T^{-1}),


and, similarly, if v is an arbitrary vector in V, α is an arbitrary scalar, and
v = Tu, then


T^{-1}(αv) = T^{-1}(α(Tu)) = T^{-1}(T(αu)) = αu = α(T^{-1}v),


q.e.d.


Solution 61. 
(1) What is the kernel of T? That is: for which vectors $(\xi, \eta)$ does it happen that
\[ T(\xi, \eta) = (0, 0)? \]
Exactly those for which $2\xi + \eta = 0$, or, in other words, $\eta = -2\xi$, and that's
a lot of them. The transformation T has a non-trivial kernel, and, therefore,
it is not invertible.

(2) The kernel question can be raised again, and yields the answer that
both $\xi$ and $\eta$ must be 0; in other words the only vector $(\xi, \eta)$ in the kernel
is $(0, 0)$. That suggests very strongly that T is invertible, but a really satisfying answer
to the question is obtained by forming $T^2$. Since all that T does is interchange
the two coordinates of whatever vector it is working on, $T^2$ interchanges them
twice, which means that $T^2$ leaves them alone. Consequence: $T^2 = 1$, or, in
other words, $T^{-1} = T$.

(3) The differentiation transformation D on $P_5$ is not invertible. Reason
(as twice before in this problem): D has a non-trivial kernel. That is: there
exist polynomials p different from 0 for which Dp = 0, namely all constant
polynomials (except 0).


Solution 62. 


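Both assertions are false.

For (1), take $(\alpha)$ and $(\beta)$ to be invertible, and put $(\gamma) = (\beta)^{-1}$,
$(\delta) = (\alpha)^{-1}$. In that case
\[ M = \begin{pmatrix} \alpha & \beta \\ \beta^{-1} & \alpha^{-1} \end{pmatrix}, \]
which makes it obvious that all four formal determinants are equal to the
matrix 0. If, in particular,
\[ (\alpha) = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \quad\text{and}\quad (\beta) = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \]
then
\[ (\alpha)^{-1} = \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix} \quad\text{and}\quad (\beta)^{-1} = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}, \]
so that
\[ M = \begin{pmatrix} 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & -1 & 1 & 0 \\ 0 & 1 & -1 & 1 \end{pmatrix}. \]

The point is that M is invertible. Such a statement is never obvious;
something must be proved. The simplest proof is concretely to exhibit the
inverse, but the calculation of matrix inverses is seldom pure joy. Be that as
it may, here it is; in the present case
\[ M^{-1} = \begin{pmatrix} -1 & 1 & 1 & 0 \\ 0 & 1 & -1 & -1 \\ 1 & 0 & -1 & -1 \\ 1 & -1 & 0 & 1 \end{pmatrix}. \]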
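
An exhibited inverse is easiest to trust after multiplying it out. The sketch below does that for the 4 x 4 matrix of part (1), built from the blocks $(\alpha)$, $(\beta)$, $(\beta)^{-1}$, $(\alpha)^{-1}$; the helper names are illustrative only. It also checks one of the four formal determinants.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


M = [[1, 0, 1, 1],
     [1, 1, 0, 1],
     [1, -1, 1, 0],
     [0, 1, -1, 1]]

# Candidate inverse, to be verified by multiplication.
M_inv = [[-1, 1, 1, 0],
         [0, 1, -1, -1],
         [1, 0, -1, -1],
         [1, -1, 0, 1]]

identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
left = matmul(M_inv, M)
right = matmul(M, M_inv)

# The 2 x 2 blocks of M, and the formal determinant alpha*delta - beta*gamma.
alpha = [row[:2] for row in M[:2]]
beta = [row[2:] for row in M[:2]]
gamma = [row[:2] for row in M[2:]]
delta = [row[2:] for row in M[2:]]
ad = matmul(alpha, delta)
bg = matmul(beta, gamma)
formal_det = [[ad[i][j] - bg[i][j] for j in range(2)] for i in range(2)]
print(left == identity, right == identity, formal_det)
```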

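For (2), take $(\alpha)$ involutory ($(\alpha)^2 = 1$) and $(\beta)$ nilpotent of index 2
($(\beta)^2 = 0$), and put $(\gamma) = (\beta)$, $(\delta) = (\alpha)$. In that case
\[ M = \begin{pmatrix} \alpha & \beta \\ \beta & \alpha \end{pmatrix}, \]
which makes it obvious that all four formal determinants are equal to the
identity matrix 1. If, in particular,
\[ (\alpha) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad\text{and}\quad (\beta) = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \]
then
\[ M = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 \end{pmatrix}. \]
Since the first and third columns of M are equal, so that M sends the first and
third natural basis vectors to the same vector, the matrix M is not invertible.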


Solution 63. 


The problem of evaluating det $M_1$ calls attention to a frequently usable
observation, namely that the determinant of a direct sum of matrices is the product
of their determinants. (The concept of direct sums of matrices has not been
defined; is its definition guessable from the present context?) Since
\[ \det\begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} = -3 \quad\text{and}\quad \det\begin{pmatrix} 3 & 4 \\ 4 & 3 \end{pmatrix} = -7, \]
it follows that det $M_1$ = 21.

If a matrix has two equal columns (or two equal rows?), then it is not
invertible, and, therefore, its determinant must be 0. The matrix $M_2$ has two
equal rows (for instance, the first and the fifth, and also the second and the
fourth) and therefore det $M_2$ = 0.

The simplest trick for evaluating det $M_3$ is to observe that $M_3$ is similar
to the direct sum of three copies of the matrix
\[ \begin{pmatrix} 3 & 2 \\ 2 & 3 \end{pmatrix}. \]
(The concept of similarity of matrices has not been defined yet; is its
definition guessable from the present context?) The similarity is achieved by a
permutation matrix. What that means, in simple language, is that if the rows and
columns of $M_3$ are permuted suitably, $M_3$ becomes such a direct sum. Since
\[ \det\begin{pmatrix} 3 & 2 \\ 2 & 3 \end{pmatrix} = 5, \]
it follows that det $M_3 = 5^3 = 125$.
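
The observation about direct sums can be checked by brute force. What follows is a minimal sketch, not an efficient method: `det` uses Laplace expansion along the first row (adequate for small matrices), and `direct_sum` places the given blocks along the diagonal. Both names are introduced here for illustration, and the block `c` is an assumed 2 x 2 matrix with determinant 5.

```python
def det(m):
    """Determinant by Laplace expansion along the first row."""
    n = len(m)
    if n == 0:
        return 1
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def direct_sum(*blocks):
    """Square matrix with the given square blocks on the diagonal, 0 elsewhere."""
    size = sum(len(b) for b in blocks)
    out = [[0] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                out[offset + i][offset + j] = value
        offset += len(b)
    return out


a = [[1, 2], [2, 1]]      # determinant -3
b = [[3, 4], [4, 3]]      # determinant -7
c = [[3, 2], [2, 3]]      # determinant 5 (assumed block)

det_sum_ab = det(direct_sum(a, b))
det_sum_ccc = det(direct_sum(c, c, c))
print(det_sum_ab, det_sum_ccc)
```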


Solution 64. 


If n = 1, then (1) is the only invertible 01-matrix and the number of its
entries equal to 1 is 1; that's an uninteresting extreme case. When n = 2, the
optimal example is
\[ \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \]
with three 1's. What happens when n = 3?

The invertible matrix
\[ \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} \]
has six 1's; can that be improved? There is one and only one chance. An extra
1 in either the second column or in the second row would ruin invertibility;
what about an extra 1 in position (3, 1)? It works: the matrix
\[ \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix} \]
is invertible. An efficient way to prove that is to note that its determinant is
equal to 1.

Is the general answer becoming conjecturable? The procedure is inductive,
and the general step is perfectly illustrated by the passage from 3 to 4.
Consider the 4 x 4 matrix
\[ \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \end{pmatrix} \]
and expand its determinant in terms of the first column. The cofactor of the
(1, 1) entry is different from 0 by the induction assumption (the corresponding
submatrix is the invertible 3 x 3 matrix of the preceding step). The (2, 1) entry is 0,
and, therefore, contributes 0 to the expansion. The cofactor of the (k, 1) entry,
for k > 2, contains two identical rows, namely the first two rows that consist
entirely of 1's; it follows that that cofactor contributes 0 also. Consequence
(by induction): the matrix is invertible.

The number of 1's in the matrix here exhibited is obtained from $n^2$ by
subtracting the number of entries in the diagonal just below the main one,
and that number is n - 1. This proves that the number of 1's can always be
as great as $n^2 - n + 1$.

Could it be greater? If a matrix has as many as $n^2 - n + 2$ ($= n^2 - (n - 2)$)
entries equal to 1, then it has at most n - 2 entries equal to 0. Consequence:
it must have at least two rows that have no 0's in them at all, that is at least
two rows with nothing but 1's in them. A matrix with two rows of 1's cannot
be invertible.
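
The inductive construction is easy to test for small n. The sketch below builds the n x n matrix whose only 0's are on the diagonal just below the main one, and checks both the determinant and the count of 1's. The function names are illustrative, and the determinant routine is a plain Laplace expansion that is adequate only for small sizes.

```python
def det(m):
    """Determinant by Laplace expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def subdiagonal_zero_matrix(n):
    """All entries 1 except 0's in the positions (i, i - 1)."""
    return [[0 if i == j + 1 else 1 for j in range(n)] for i in range(n)]


def count_ones(m):
    return sum(sum(row) for row in m)


results = []
for n in range(1, 7):
    m = subdiagonal_zero_matrix(n)
    results.append((n, det(m), count_ones(m)))
print(results)
```

Each triple lists n, the determinant, and the number of 1's; the last should equal $n^2 - n + 1$.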


Comment. Can the desired invertibilities be proved without determinants? 
Yes, but the proof with determinants seems to be quite a bit simpler, and even, 
in some sense, less computational. 


Solution 65. 


Yes, L(V) has a basis consisting of invertible linear transformations. One
way to construct such a basis is to start with an easy one that consists of
non-invertible transformations and modify it. The easiest basis of L(V) is the
set of all customary matrix units: they are the matrices $E(i, j)$ whose $(p, q)$
entry is $\delta(i, p)\delta(j, q)$, where $\delta$ is the Kronecker delta. (The indices i, j, p, q
here run through the values from 1 to n.) In plain language: each $E(i, j)$ has
all entries except one equal to 0; the non-zero entry is a 1 in position $(i, j)$.
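Example: if n = 4, then
\[ E(2, 3) = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \]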


The $n^2$ matrices $E(i, j)$ constitute a basis for the vector space L(V), but,
obviously, they are not invertible. If
\[ F(i, j) = E(i, j) + 1 \]
(where the symbol “1” denotes the identity matrix), then the matrices
$F(i, j)$ are invertible (that's easy) and they span L(V) (that's not obvious).
Since there are $n^2$ of them, the spanning statement can be proved by showing
that the $F(i, j)$'s are linearly independent.

Suppose, therefore, that a linear combination of the F's vanishes:
\[ \sum_{i,j} \alpha(i, j)F(i, j) = 0, \]
or, in other words,
\[ X = \Bigl(\sum_{i,j} \alpha(i, j)\Bigr) \cdot 1 + \sum_{i,j} \alpha(i, j)E(i, j) = 0. \]
If $p \ne q$, then the (p, q) entry of X is $0 + \alpha(p, q)$, and therefore $\alpha(p, q) = 0$.
What about the entries $\alpha(p, p)$? The (p, p) entry of X is
\[ \sum_{i,j} \alpha(i, j) + \alpha(p, p), \]
which is therefore 0. But it is already known that $\alpha(i, j) = 0$ when $i \ne j$,
and it follows that
\[ \alpha(p, p) + \sum_i \alpha(i, i) = 0 \]
for each p. Consequence: the $\alpha(p, p)$'s are all equal (!), and, what's more,
their common value is the negative of their sum. The only way that can happen
is to have $\alpha(p, p) = 0$ for all p, and that finishes the proof that the F's are
linearly independent.
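
For a particular small n the two claims (each F(i, j) is invertible, and together they are linearly independent) can be confirmed by computing ranks. The following is a sketch under the assumption n = 3 and rational scalars; `rank` is a plain Gauss-Jordan elimination written for this illustration, with indices running from 0 rather than 1.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gauss-Jordan elimination."""
    m = [[Fraction(v) for v in r] for r in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def F(i, j, n):
    """Identity matrix plus the matrix unit E(i, j)."""
    mat = [[1 if p == q else 0 for q in range(n)] for p in range(n)]
    mat[i][j] += 1
    return mat


n = 3
units = [F(i, j, n) for i in range(n) for j in range(n)]

# Each F(i, j) is an n x n matrix; flatten it to a vector of length n*n.
flattened = [[v for row in f for v in row] for f in units]

each_invertible = all(rank(f) == n for f in units)
independent = rank(flattened) == n * n
print(each_invertible, independent)
```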


Solution 66. 


The answer is that on a finite-dimensional vector space every injective linear 
transformation is surjective, and vice versa. 

Suppose, indeed, that $\{u_1, u_2, \ldots, u_n\}$ is a basis of a vector space V
and that T is a linear transformation on V with kernel {0}. Look at the
transformed vectors $Tu_1, Tu_2, \ldots, Tu_n$: can they be dependent? That is: can
there exist scalars $\alpha_1, \alpha_2, \ldots, \alpha_n$ such that
\[ \alpha_1Tu_1 + \alpha_2Tu_2 + \cdots + \alpha_nTu_n = 0? \]
If that happened, then (use the linearity of T) it would follow that
\[ T(\alpha_1u_1 + \alpha_2u_2 + \cdots + \alpha_nu_n) = 0, \]
and hence that
\[ \alpha_1u_1 + \alpha_2u_2 + \cdots + \alpha_nu_n = 0 \]
(here is where the assumption about the kernel of T is used). Since, however,
the set
\[ \{u_1, u_2, \ldots, u_n\} \]
is independent, it would follow that all the $\alpha$'s are 0; in other words the
transformed vectors
\[ Tu_1, Tu_2, \ldots, Tu_n \]
are independent. An independent set of n vectors in an n-dimensional vector
space must be a basis (if not, it could be enlarged to become one, but then
the number of elements in the enlarged basis would be different from n; see
Problem 42). Since a basis of V spans V, it follows that every vector is
a linear combination of the vectors $Tu_1, Tu_2, \ldots, Tu_n$ and hence that the
range of T is equal to V. Conclusion: ker T = {0} implies ran T = V.

The reasoning in the other direction resembles the one just used. Suppose
this time that $\{u_1, u_2, \ldots, u_n\}$ is a basis of a vector space V and that T is a
linear transformation on V such that ran T = V. Assertion: the transformed
vectors $Tu_1, Tu_2, \ldots, Tu_n$ span V. Reason: since, by assumption, every vector
v in V is the image under T of some vector u, and since every vector u
is a linear combination of the form
\[ \alpha_1u_1 + \alpha_2u_2 + \cdots + \alpha_nu_n, \]
it follows indeed that
\[ v = Tu = T(\alpha_1u_1 + \alpha_2u_2 + \cdots + \alpha_nu_n) = \alpha_1Tu_1 + \alpha_2Tu_2 + \cdots + \alpha_nTu_n. \]
Since a total set of n vectors in an n-dimensional vector space must be a basis
(if not, it could be decreased to become one, but then the number of elements
in the decreased basis would be different from n; see Problem 42), it follows
that the transformed vectors $Tu_1, Tu_2, \ldots, Tu_n$ are independent. If now u is
a vector in ker T, then expand u in terms of the basis $\{u_1, u_2, \ldots, u_n\}$, so
that
\[ u = \alpha_1u_1 + \alpha_2u_2 + \cdots + \alpha_nu_n, \]
infer that
\[ 0 = Tu = \alpha_1Tu_1 + \alpha_2Tu_2 + \cdots + \alpha_nTu_n, \]
and hence that the $\alpha$'s are all 0, so that u = 0. Conclusion: ran T = V implies ker T = {0}.

Comment. The differentiation operator D on the vector space $P_5$ is neither
injective nor surjective; that's an instance of the result of this section. The
differentiation operator D on the vector space P is surjective (is that right?),
but not injective. The integration operator T (see Problem 56) is injective but
not surjective. What's wrong?

The answer is that nothing is wrong; the theorem is about finite-dimensional
vector spaces, and P is not one of them.


Solution 67. 


If the dimension is 2, then there are only two ways a basis (consisting of 
two elements) can be permuted: leave its elements alone or interchange them. 
The identity permutation obviously doesn’t affect the matrix at all, and the 
interchange permutation interchanges the two columns. 

It is an easy (and familiar?) observation that every permutation can be 
achieved by a sequence of interchanges of just two objects, and, in the light of 
the comment in the preceding paragraph, the effect of each such interchange 
is the corresponding interchange of the columns of the matrix. It is, however, 
not necessary to make use of the achievability of permutations by interchanges 
(technical word: transpositions); the conclusion is almost as easy to arrive at 
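directly. If, for instance, the dimension is 3, if a basis is $\{e_1, e_2, e_3\}$, and if
the permutation under consideration replaces that basis by $\{e_3, e_1, e_2\}$, then
the effect of that replacement on a matrix such as
\[ \begin{pmatrix} \alpha_{11} & \alpha_{12} & \alpha_{13} \\ \alpha_{21} & \alpha_{22} & \alpha_{23} \\ \alpha_{31} & \alpha_{32} & \alpha_{33} \end{pmatrix} \]
produces the matrix
\[ \begin{pmatrix} \alpha_{13} & \alpha_{11} & \alpha_{12} \\ \alpha_{23} & \alpha_{21} & \alpha_{22} \\ \alpha_{33} & \alpha_{31} & \alpha_{32} \end{pmatrix}. \]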


Solution 68. 


To say that $A = \{\alpha_{ij}\}$ is a diagonal matrix is the same as saying that
$\alpha_{ij} = \alpha_{ii}\delta_{ij}$ for all i and j (where $\delta_{ij}$ is the Kronecker delta, equal to
1 or 0 according as $i = j$ or $i \ne j$). If $B = \{\beta_{ij}\}$, then the (i, j) entry of AB is
\[ \sum_{k=1}^{n} \alpha_{ii}\delta_{ik}\beta_{kj} = \alpha_{ii}\beta_{ij} \]
(because the presence of $\delta_{ik}$ makes every term except the one in which k = i
equal to 0), and the (i, j) entry of BA is
\[ \sum_{k=1}^{n} \beta_{ik}\alpha_{kk}\delta_{kj} = \beta_{ij}\alpha_{jj}. \]
If $i \ne j$, then the assumption about the diagonal entries says that $\alpha_{ii} \ne \alpha_{jj}$,
and it follows therefore, from the commutativity assumption, that $\beta_{ij}$ must be
0. Conclusion: B is a diagonal matrix.
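
The two entry formulas say that the (i, j) entry of AB - BA is $(\alpha_{ii} - \alpha_{jj})\beta_{ij}$. A small numerical sketch, with made-up diagonal entries and an arbitrary B (none of these values come from the text), confirms that:

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


diag = [2, 5, 7]                       # distinct diagonal entries (sample values)
A = [[diag[i] if i == j else 0 for j in range(3)] for i in range(3)]
B = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # arbitrary, not diagonal

AB = matmul(A, B)
BA = matmul(B, A)
commutator = [[AB[i][j] - BA[i][j] for j in range(3)] for i in range(3)]
predicted = [[(diag[i] - diag[j]) * B[i][j] for j in range(3)] for i in range(3)]

# A diagonal B, by contrast, does commute with A.
D = [[1, 0, 0], [0, 4, 0], [0, 0, 9]]
diagonal_commutes = matmul(A, D) == matmul(D, A)
print(commutator == predicted, diagonal_commutes)
```

Since the differences of distinct diagonal entries are non-zero, the commutator vanishes only when every off-diagonal entry of B does.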


Solution 69. 


If B commutes with every A, then in particular it commutes with every
diagonal A with distinct diagonal entries, and it follows therefore, from Problem
68, that B must be diagonal; in the sequel it may be assumed, with no loss
of generality, that B is of the form
\[ \begin{pmatrix} \beta_1 & 0 & 0 & 0 \\ 0 & \beta_2 & 0 & 0 \\ 0 & 0 & \beta_3 & 0 \\ 0 & 0 & 0 & \beta_4 \end{pmatrix}. \]

At the same time B commutes with the matrices of all those linear
transformations that leave fixed all but two elements of the basis and interchange
those two. In matrix language
those transformations can be described as follows: let p and q be any two
distinct indices, and let C be obtained from the identity matrix by replacing
the 1's in positions (p, p) and (q, q) by 0's and replacing the 0's in positions (p, q)
and (q, p) by 1's. Typical example (with n = 4, p = 2, and q = 3):
\[ C = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \]
Since
\[ BC = \begin{pmatrix} \beta_1 & 0 & 0 & 0 \\ 0 & 0 & \beta_2 & 0 \\ 0 & \beta_3 & 0 & 0 \\ 0 & 0 & 0 & \beta_4 \end{pmatrix} \]
and
\[ CB = \begin{pmatrix} \beta_1 & 0 & 0 & 0 \\ 0 & 0 & \beta_3 & 0 \\ 0 & \beta_2 & 0 & 0 \\ 0 & 0 & 0 & \beta_4 \end{pmatrix}, \]
it follows that $\beta_2 = \beta_3$. It's clear (isn't it?) that the method works in general
and proves that all the $\beta$'s are equal.


Solution 70.


Consider the linear transformation
\[ \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \]
or, more properly speaking, consider the linear transformation A on $\mathbb{R}^2$ defined
by the matrix shown. Note that if $u = (\alpha, \beta)$ is any vector in $\mathbb{R}^2$, then
$Au = (\beta, 0)$. Consequence: if M is an invariant subspace that contains a
vector $(\alpha, \beta)$ with $\beta \ne 0$, then M contains $(\beta, 0)$ (and therefore (1, 0)), and
it follows (via the formation of linear combinations) that M contains $(0, \beta)$
(and therefore (0, 1)). In this case $M = \mathbb{R}^2$.

If M is neither O nor $\mathbb{R}^2$, then every vector in M must be of the form
$(\alpha, 0)$, and the set $M_1$ of all those vectors does in fact constitute an invariant
subspace. Conclusion: the only invariant subspaces are O, $M_1$, and $\mathbb{R}^2$.


Solution 71. 


If D is the differentiation operator on the space $P_n$ of polynomials of degree
less than or equal to n, and if m < n, then $P_m$ is a subspace of $P_n$, and the
subspace $P_m$ is invariant under D. Does $P_m$ have an invariant complement
in $P_n$?

The answer is no. Indeed, if p is a polynomial in $P_n$ that is not in
$P_m$, in other words if the degree k of p is strictly greater than m, then
replace p by a scalar multiple so as to justify the assumption that p is monic
($p(t) = t^k + \alpha_{k-1}t^{k-1} + \cdots + \alpha_0$). If p belongs to a subspace invariant under
D, then $Dp, D^2p, \ldots$ all belong to that subspace, and, therefore, so does the
polynomial $D^{k-m}p$, which is of degree m. Consequence: every polynomial not in $P_m$
has the property that if D is applied to it the right number of times, the result
is a non-zero polynomial in $P_m$. Conclusion: $P_m$ can have no invariant complement.


Comment. If n = 1, then $P_n$ (= $P_1$) consists of all polynomials $\alpha + \beta t$ of
degree 1 or less, and D sends such a polynomial onto the constant polynomial
$\beta$ ($= \beta + 0 \cdot t$). That is only trivially (notationally) different from the set
of ordered pairs $(\alpha, \beta)$ with the transformation that sends such a pair onto
$(\beta, 0)$; in other words in that case the present solution reduces to Solution
70.
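
The key step, that applying D exactly k - m times to a polynomial of degree k lands on a non-zero polynomial of degree m, can be watched in a small example. The sketch below represents a polynomial by its list of coefficients (constant term first); the sample polynomial and the value of m are arbitrary choices made for illustration.

```python
def derivative(coeffs):
    """Derivative of the polynomial with coeffs[i] the coefficient of t**i."""
    return [i * c for i, c in enumerate(coeffs)][1:] or [0]


def degree(coeffs):
    """Degree of the polynomial, or None for the zero polynomial."""
    nonzero = [i for i, c in enumerate(coeffs) if c != 0]
    return max(nonzero) if nonzero else None


def iterate_D(coeffs, times):
    """Apply the differentiation operator the given number of times."""
    for _ in range(times):
        coeffs = derivative(coeffs)
    return coeffs


p = [1, 2, 0, 0, 1]     # p(t) = 1 + 2t + t^4, so k = 4
m = 2
q = iterate_D(p, degree(p) - m)
print(q, degree(q))
```

Here q is $12t^2$: it lies in $P_2$ and in every D-invariant subspace that contains p, so no such subspace can meet $P_2$ in 0 only.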


Solution 72. 


A useful algebraic characterization of projections is idempotence.
Explanation: to say that a linear transformation A is idempotent means that
$A^2 = A$. (The Latin forms “idem” and “potent” mean “same” and
“power”.) In other words, the assertion is that if E is a projection, then
$E^2 = E$, and, conversely, if $E^2 = E$, then E is a projection.

The idempotence of a projection is easy to prove. Suppose, indeed, that
E is the projection on M along N. If z = x + y is a vector, with x in M and
y in N, then Ez = x, and, since x = x + 0, so that Ex = x, it follows that
$E^2z = Ez$.

Suppose now that E is an idempotent linear transformation, and let M and
N be the range and the kernel of E respectively. Both M and N are subspaces;
that's known. If z is in M, then, by the definition of range, z = Eu for some
vector u, and if z is also in N, then, by the definition of kernel, Ez = 0.
Since $E = E^2$, the application of E to both sides of the equation z = Eu
implies that Ez = z; since, at the same time, Ez = 0, it follows that z = 0.
Conclusion: $M \cap N = O$.

If z is an arbitrary vector in V, consider the vectors
\[ Ez \quad\text{and}\quad z - Ez \;(= (1 - E)z); \]
call them x and y. The vector x is in ran E, and, since
\[ Ey = Ez - E^2z = 0, \]
the vector y is in ker E. Since z = x + y, it follows that
\[ M + N = V. \]


The preceding two paragraphs between them say exactly that M and N
are complementary subspaces and that the projection of any vector z to M
along N is equal to Ez; that settles everything. Note, in particular, that the
argument answers both questions: projections are just the idempotent linear
transformations, and if E is the projection on M along N, then ran E = M
and ker E = N.

It is sometimes pleasant to know that if E is a projection, then ran E
consists exactly of the fixed points of E. That is: if z is in ran E, then
Ez = z, and, trivially, if Ez = z, then z is in ran E.
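
A concrete idempotent shows the decomposition z = Ez + (z - Ez) at work. The matrix E and the vector z below are sample values chosen for illustration, not taken from the text; the sketch checks that Ez is a fixed point of E and that z - Ez is in the kernel.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def apply(m, v):
    """Matrix m applied to the vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


E = [[1, 1],
     [0, 0]]            # sample idempotent: range spanned by (1, 0), kernel by (1, -1)
z = [3, 5]              # sample vector

x = apply(E, z)                         # component in ran E
y = [zi - xi for zi, xi in zip(z, x)]   # component in ker E

idempotent = matmul(E, E) == E
print(idempotent, x, y, apply(E, x), apply(E, y))
```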


Solution 73. 


If E and F are projections such that E + F also is a projection, then
\[ (E + F)^2 = E + F, \]
which says, on multiplying out, that
\[ EF + FE = 0. \]
Multiply this equation on both left and right by E and get
\[ EF + EFE = 0 \quad\text{and}\quad EFE + FE = 0. \]
Subtract one of these equations from the other and conclude that
\[ EF - FE = 0, \]
and hence (since both the sum and the difference vanish)
\[ EF = FE = 0. \]
That's a necessary condition that E + F be a projection.

It is much easier to prove that the condition is sufficient also: if it is
known that EF = FE = 0, then the cross product terms in $(E + F)^2$
disappear, and, in view of the idempotence of E and F separately, it follows
that E + F is idempotent.

Conclusion: the sum of two projections is a projection if and only if their
products are 0. (Careful: two products, one in each order.)


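Question. Can the product of two projections be 0 in one order but not the
other? Yes, and that takes only a little thought and a little experimental search.
If
\[ E = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad F = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \]
and EF = 0, then $\alpha = \beta = 0$. The resulting
$F = \begin{pmatrix} 0 & 0 \\ \gamma & \delta \end{pmatrix}$ is idempotent if
and only if either $\gamma = \delta = 0$ or else $\delta = 1$. A pertinent example is
\[ F = \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix}; \]
in that case EF = 0 and $FE \ne 0$.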
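
A pair of this kind can be checked by direct multiplication. The sketch below uses one possible choice, E with a single 1 in the upper left corner and F with second row (1, 1); these specific matrices are an assumption made for the illustration. Both are idempotent, the product vanishes in one order only, and, in agreement with the conclusion above, E + F fails to be idempotent.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


E = [[1, 0],
     [0, 0]]
F = [[0, 0],
     [1, 1]]            # assumed example: gamma = 1, delta = 1
zero = [[0, 0], [0, 0]]

EF = matmul(E, F)
FE = matmul(F, E)
S = [[E[i][j] + F[i][j] for j in range(2)] for i in range(2)]   # E + F

print(EF, FE, matmul(S, S) == S)
```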


Solution 74.


The condition is that $E^2 = E^3$; a strong way for a linear transformation to
satisfy that is to have $E^2 = 0$. Is it possible to have $E^2 = 0$ without E = 0?
Sure; a standard easy example is
\[ E = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}. \]
In that case, indeed, $E^2(1 - E) = 0$, but $E(1 - E) = 0$ is false. That settles
the first question.

It is easy to see that the answer to the second question is no: for the E
just given it is not true that $E(1 - E)^2 = 0$ (because, in fact, $E(1 - E)^2 = E - E^2$).

That answers both questions, but it does not answer all the natural
questions that should be asked.

One natural question is this: if $E(1 - E)^2 = 0$, does it follow that E is
idempotent? No; how could it? Just replace the E used above by 1 - E; that
is, use
\[ \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} \]
as the new E. Then
\[ 1 - E = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \]
so that $(1 - E)^2 = 0$, and therefore $E(1 - E)^2 = 0$, but it is not true that
$E^2 = E$.

Another natural question: if both
\[ E^2(1 - E) = 0 \quad\text{and}\quad E(1 - E)^2 = 0, \]
does it follow that E is idempotent? Sure: add the two equations and simplify
to get $E - E^2 = 0$.


Chapter 5. Duality 


Solution 75. 


If $\eta = 0$, then $\xi = 0$, everything is trivial and the conclusion is true. In
the remaining case, consider a vector $x_0$ such that $\eta(x_0) \ne 0$, and reason
backward. That is, assume for a moment that there does exist a scalar $\alpha$ such
that $\xi(x) = \alpha\eta(x)$ for all x, and that therefore, in particular, $\xi(x_0) = \alpha\eta(x_0)$,
and infer that
\[ \alpha = \frac{\xi(x_0)}{\eta(x_0)}. \]
[Note, not a surprise, but pertinent: it doesn't matter which $x_0$ was picked;
so long as $\eta(x_0) \ne 0$, the fraction gives the value of $\alpha$. Better said: if there
is an $\alpha$, it is uniquely determined by the linear functionals $\xi$ and $\eta$.]

Now start all over again, and go forward (under the permissible assumption
that there exists a vector $x_0$ such that $\eta(x_0) \ne 0$). The linear functional
$\eta$ sends
\[ x_0 \ \text{to}\ \eta(x_0), \]
and hence it sends
\[ \frac{x_0}{\eta(x_0)} \ \text{to}\ 1, \]
and hence
\[ \frac{\gamma x_0}{\eta(x_0)} \ \text{to}\ \gamma \]
for every scalar $\gamma$. Special case: $\eta$ sends
\[ \frac{\eta(x)}{\eta(x_0)}x_0 \ \text{to}\ \eta(x) \]
for every vector x. Consequence:
\[ \eta\Bigl(x - \frac{\eta(x)}{\eta(x_0)}x_0\Bigr) = 0 \]
for all x. The relation between $\xi$ and $\eta$ now implies that
\[ \xi\Bigl(x - \frac{\eta(x)}{\eta(x_0)}x_0\Bigr) = 0 \]
for all x, which says exactly that
\[ \xi(x) = \frac{\xi(x_0)}{\eta(x_0)}\,\eta(x), \]
and that is what was wanted (with $\alpha = \xi(x_0)/\eta(x_0)$).


Solution 76. 


Yes, the dual of a finite-dimensional vector space V is finite-dimensional, and
the way to prove it is to use a basis of V to construct a basis of V'. Suppose,
indeed, that $\{x_1, x_2, \ldots, x_n\}$ is a basis of V. Plan: find linear functionals
$\xi_1, \xi_2, \ldots, \xi_n$ that “separate” the x's in the sense that
\[ \xi_i(x_j) = \delta_{ij} \]
for each i, j = 1, 2, ..., n. Can that be done? If it could, then the value of $\xi_i$
at a typical vector
\[ x = \alpha_1x_1 + \alpha_2x_2 + \cdots + \alpha_nx_n \]
would be $\alpha_i$, and that shows how $\xi_i$ should be defined when it is not yet
known. That is: writing $\xi_i(x) = \alpha_i$ for each i does indeed define a linear
functional (verification?).

The linear functionals $\xi_1, \xi_2, \ldots, \xi_n$ are linearly independent. Proof: if
\[ \beta_1\xi_1(x) + \beta_2\xi_2(x) + \cdots + \beta_n\xi_n(x) = 0 \]
for all x, then, in particular, the linear combination vanishes when $x = x_j$
(j = 1, ..., n), which says exactly that $\beta_j = 0$ for each j.

Every linear functional is a linear combination of the $\xi_i$'s. Indeed, if $\xi$
is an arbitrary linear functional and $x = \alpha_1x_1 + \alpha_2x_2 + \cdots + \alpha_nx_n$ is an
arbitrary vector, then
\[
\begin{aligned}
\xi(x) &= \xi(\alpha_1x_1 + \alpha_2x_2 + \cdots + \alpha_nx_n) \\
&= \alpha_1\xi(x_1) + \alpha_2\xi(x_2) + \cdots + \alpha_n\xi(x_n) \\
&= \xi_1(x)\xi(x_1) + \xi_2(x)\xi(x_2) + \cdots + \xi_n(x)\xi(x_n) \\
&= (\xi(x_1)\xi_1 + \xi(x_2)\xi_2 + \cdots + \xi(x_n)\xi_n)(x).
\end{aligned}
\]

The preceding two paragraphs yield the conclusion that the $\xi$'s constitute
a basis of V' (and hence that V' is finite-dimensional).


Corollary. dim V' = dim V = n.
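
In $\mathbb{R}^2$ the construction can be carried out by hand and then verified. In the sketch below a linear functional is stored as its list of coefficients; the basis x1, x2, the functionals xi1, xi2, and the test functional xi are sample data chosen for illustration. The check covers both the separation property and the expansion of an arbitrary functional in terms of the dual basis.

```python
def evaluate(f, v):
    """Value of the functional with coefficient list f at the vector v."""
    return sum(a * b for a, b in zip(f, v))


# Sample basis of R^2 and its dual basis (found by solving x = a1*x1 + a2*x2).
x1, x2 = [1, 1], [0, 1]
xi1, xi2 = [1, 0], [-1, 1]     # xi1(x) = a1, xi2(x) = a2

# Separation property: xi_i(x_j) is the Kronecker delta.
delta = [[evaluate(f, v) for v in (x1, x2)] for f in (xi1, xi2)]

# Expansion of an arbitrary functional: xi = xi(x1)*xi1 + xi(x2)*xi2.
xi = [5, 7]
combo = [evaluate(xi, x1) * a + evaluate(xi, x2) * b for a, b in zip(xi1, xi2)]
print(delta, combo)
```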


Solution 77. 


The answer is yes: the linear transformation T defined by
\[ T(x) = T(x_1, \ldots, x_n) = (y_1(x), \ldots, y_n(x)) \]
is invertible. One reasonably quick way to prove that is to examine the kernel
of T. Suppose that $x = (x_1, \ldots, x_n)$ is a vector in $F^n$ that belongs to the
kernel of T, so that
\[ (y_1(x), \ldots, y_n(x)) = (0, \ldots, 0). \]
Since the coordinate projections $p_j$ belong to the span of $y_1, \ldots, y_n$, it follows
that for each j there exist scalars $\alpha_1, \ldots, \alpha_n$ such that
\[ p_j = \sum_k \alpha_k y_k. \]
Consequence:
\[ p_j(x) = \sum_k \alpha_k y_k(x) = 0 \]
for each j, which implies, of course, that x = 0; in other words the kernel of
T is {0}. Conclusion: T is invertible.


Solution 78. 


The verification that T is linear is the easiest step. Indeed: if x and y are in
V and $\alpha$ and $\beta$ are scalars, then
\[
\begin{aligned}
T(\alpha x + \beta y)(u) &= u(\alpha x + \beta y) = \alpha u(x) + \beta u(y) \\
&= \alpha(Tx)(u) + \beta(Ty)(u) \\
&= (\alpha Tx + \beta Ty)(u).
\end{aligned}
\]
How can it happen (for a vector x in V) that Tx = 0 (in V'')? Answer: it
happens when
\[ u(x) = 0 \]
for every linear functional u on V, and that must imply that x = 0 (see the
discussion preceding Problem 74). Consequence: T is always a one-to-one
mapping from V to V''.

The only question that remains to be asked and answered is whether
or not T maps V onto V'', and in the finite-dimensional case the answer is
easily accessible. The range of T is a subspace of V'' (Problem 55); since T
is an isomorphism from V to ran T, the dimension of ran T is equal to the
dimension of V. The dimension of V'' is equal to the dimension of V also
(because dim V = dim V' and dim V' = dim V''). A subspace of dimension
n in a vector space of dimension n cannot be a proper subspace. Consequence:
ran T = V''. Conclusion: the natural mapping of a finite-dimensional vector
space to its double dual is an isomorphism, or, in other words, every
finite-dimensional vector space is reflexive.


Solution 79. 


Some proofs in mathematics require ingenuity, and others require nothing 
more than remembering and using the definitions—this one begins with a tiny 
inspiration and then finishes with the using-the-definitions kind of routine. 

Choose a basis $\{x_1, x_2, \ldots, x_n\}$ for V so that its first m elements are
in M (and therefore form a basis for M); let $\{u_1, u_2, \ldots, u_n\}$ be the dual
basis in V' (see Solution 76). Since $u_i(x_j) = \delta_{ij}$, it follows that the $u_i$'s with
i > m annihilate M and those with $i \le m$ do not. In other words, if the span of
the $u_i$'s with i > m is called N, then $N \subset M^0$.

If, on the other hand, u is in $M^0$, then, just because it is in the space V',
the linear functional u is a linear combination of the $u_i$'s. Since any such linear
combination
\[ u = \beta_1u_1 + \beta_2u_2 + \cdots + \beta_nu_n \]
applied to one of the $x_j$'s with $j \le m$ yields 0 (because $x_j$ is in M) and, at
the same time, yields $\beta_j$ (because $u_i(x_j) = 0$ when $i \ne j$), it follows that the
coefficients of the early $u_i$'s are all 0. Consequence: u is a linear combination
of the latter $u_i$'s, or, in other words, u is in N, or, better still, $M^0 \subset N$.

The conclusions of the preceding two paragraphs imply that $M^0 = N$,
and hence, since N has a basis of n - m elements, that
\[ \dim M^0 = n - m. \]


Solution 80. 


If the spaces V and V'' are identified (as suggested by Problem 78), then,
by definition, $M^{00}$ consists of the set of all those vectors x in V such that
u(x) = 0 for all u in $M^0$. Since, by the definition of $M^0$, the equation u(x) = 0
holds for all x in M and all u in $M^0$, so that every x in M satisfies the
condition just stated for belonging to $M^{00}$, it follows that $M \subset M^{00}$. If the
dimension of V is n and the dimension of M is m, then (see Problem 79) the
dimension of $M^0$ is n - m, and therefore, by the same result, the dimension
of $M^{00}$ is n - (n - m). In other words M is an m-dimensional subspace of
an m-dimensional space $M^{00}$, and that implies that M and $M^{00}$ must be the
same.


Solution 81. 


Suppose that A is a linear transformation on a finite-dimensional vector space 
V and A’ is its adjoint on V’. If u is an arbitrary vector in ker A’, so that 
A'u = 0, then, of course (A’u)(x) = 0 for every x in V, and consequently 
u( Ax) = 0 for every x in V. The latter equation says exactly that u takes 
the value 0 at every vector in the range of A, or, simpler said, that u belongs 
to (ran A)°. The argument is reversible: if u belongs to (ran A)°, so that 
u( Ax) = 0 for every x, then (A'u) (x) = 0 for every x, and therefore A'u = 0, 
or, simpler said, u belongs to ker A’. Conclusion: 


ker A’ = (ran A)°. 


It should not come as too much of a surprise that annihilators enter. The range
and the kernel of A are subspaces of V, and the range and the kernel of A'
are subspaces of V'; what possible relations can there be between subspaces
of V and subspaces of V'? The only known kind of relation (at least so far)
has to do with annihilators.

If A is replaced by A' in the equation just derived, the result is
\[ \ker A'' = (\operatorname{ran} A')^0, \]
an equation that seems to give some information about ran A'; that's
good. The information is, however, indirect (via the annihilator), and it is
expressed indirectly (in terms of A'' instead of A). Both of these blemishes
can be removed. If V'' is identified with V (remember reflexivity), then A''
becomes A, and if the annihilator of both sides of the resulting equation is
formed (remember double annihilators), the result is
\[ \operatorname{ran} A' = (\ker A)^0. \]


Question. Was finite-dimensionality needed in the argument? Sure: the sec- 
ond paragraph made use of reflexivity. What about the first paragraph —is 
finite-dimensionality needed there? 


Solution 82. 
What is obvious is that the adjoint of a projection is a projection. The
reason is that projections are characterized by idempotence (Problem 72), and
idempotence is inherited by adjoints.

Problem 72 describes also what a projection is “on” and “along”: it says
that if E is the projection on M along N, then
\[ N = \ker E \quad\text{and}\quad M = \operatorname{ran} E. \]
It is a special case of the result of Solution 81 that
\[ \ker E' = (\operatorname{ran} E)^0 \quad\text{and}\quad \operatorname{ran} E' = (\ker E)^0. \]
Consequence: E' is the projection on $N^0$ along $M^0$.


Solution 83.


Suppose that A is a linear transformation on a finite-dimensional vector space
V, with basis $\{e_1, \ldots, e_n\}$, and consider its adjoint A' on the dual space V',
with the dual basis $\{u_1, \ldots, u_n\}$. What is wanted is to compare the expansion
of each A'u in terms of the u's with the expansion of each Ae in terms of
the e's. The choice of notation should exercise some alphabetic care; this is
a typical case where subscript juggling cannot be avoided, and carelessness
with letters can make them step on their own tails in a confusing manner.

The beginning of the program is easy enough to describe: expand $A'u_j$
in terms of the u's, and compare the result with what happens when $Ae_i$ is
expanded in terms of the e's. The alphabetic care is needed to make sure that
the “dummy variable” used in the summation is a harmless one, meaning, in
particular, that it doesn't collide with either j or i. Once that's said, things
begin to roll: write
\[ A'u_j = \sum_k \alpha'_{kj}u_k, \]
evaluate the result at each $e_i$, and do what the notation almost seems to force:
\[ A'u_j(e_i) = \sum_k \alpha'_{kj}u_k(e_i) = \sum_k \alpha'_{kj}\delta_{ki} = \alpha'_{ij}. \]
All right; that gives an expression for the matrix entries $\alpha'_{ij}$ of A'; what is
to be done next? Answer: recall the way the matrix entries are defined for A,
and hope that the two expressions together give the desired information. That
is: look at
\[ Ae_i = \sum_k \alpha_{ki}e_k, \]
apply each $u_j$, and get
\[ u_j(Ae_i) = u_j\Bigl(\sum_k \alpha_{ki}e_k\Bigr) = \sum_k \alpha_{ki}u_j(e_k) = \sum_k \alpha_{ki}\delta_{jk} = \alpha_{ji}. \]
Since
\[ u_j(Ae_i) = A'u_j(e_i), \]
it follows that
\[ \alpha'_{ij} = \alpha_{ji} \]
for all i and j. Victory: that's a good answer. It says that the matrix entries of
A' are the same as the matrix entries of A with the subscripts interchanged.
Equivalently: the matrix of A' is the same as the matrix of A with the rows
and columns interchanged. Still better (and this is the most popular point of
view): the matrix of A' is obtained from the matrix of A by flipping it over
the main diagonal. In the customary technical language: the matrix of A' is
the transpose of the matrix of A.
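
The subscript juggling can be replayed mechanically. In the sketch below the basis vectors are the standard ones of $\mathbb{R}^3$, the dual functional $u_j$ picks out the j-th coordinate, and the matrix A is an arbitrary sample. The entry in row i and column j of the matrix of A' is computed from the defining relation $(A'u_j)(e_i) = u_j(Ae_i)$ and compared with the transpose; indices run from 0, and all names are illustrative.

```python
n = 3
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]        # arbitrary sample matrix


def e(i):
    """i-th standard basis vector."""
    return [1 if k == i else 0 for k in range(n)]


def apply(m, v):
    """Matrix m applied to the vector v."""
    return [sum(m[r][k] * v[k] for k in range(n)) for r in range(n)]


def u(j, v):
    """Dual basis functional u_j: the j-th coordinate of v."""
    return v[j]


# Row i, column j: coefficient of u_i in A'u_j, which equals (A'u_j)(e_i) = u_j(A e_i).
adjoint_matrix = [[u(j, apply(A, e(i))) for j in range(n)] for i in range(n)]
transpose = [[A[j][i] for j in range(n)] for i in range(n)]
print(adjoint_matrix == transpose)
```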


Chapter 6. Similarity 


Solution 84. 


The interesting and useful feature of the relation between x and y is the 
answer to this question: how does one go from x to y? To “go” means (and 
that shouldn’t be a surprise) to apply a linear transformation. The natural way 
to go that offers itself is the unique linear transformation T determined by
the equations
\[ Tx_1 = y_1, \ \ldots, \ Tx_n = y_n. \]
The linear transformation T has the property that Tx = y; indeed
\[
\begin{aligned}
Tx &= T(\alpha_1x_1 + \cdots + \alpha_nx_n) \\
&= \alpha_1Tx_1 + \cdots + \alpha_nTx_n \\
&= \alpha_1y_1 + \cdots + \alpha_ny_n = y.
\end{aligned}
\]
The answer to the original question, expressed in terms of T, is therefore
simply this: the relation between x and y is that Tx = y. That is: a “change
of basis” is effected by the linear transformation that changes one basis to
another.


Question. Is T invertible? 


Solution 85. 


The present question compares with the one in Problem 84 the way matrices
compare with linear transformations. The useful step in the solution of
Problem 84 was to introduce the linear transformation T that sends the x's to the
y's. Question: what is the matrix of that transformation (with respect to the
basis $(x_1, \ldots, x_n)$)? The answer is obtained (see Solution 59) by applying T
to each $x_j$ and expanding the result in terms of the x's. If
\[ y_j = Tx_j = \sum_i \alpha_{ij}x_i, \]
then
\[ \sum_j \eta_jy_j = \sum_j \eta_j \sum_i \alpha_{ij}x_i = \sum_i \Bigl(\sum_j \alpha_{ij}\eta_j\Bigr)x_i. \]
Since, however, by assumption
\[ \sum_i \xi_ix_i = \sum_j \eta_jy_j, \]
it follows that
\[ \xi_i = \sum_j \alpha_{ij}\eta_j. \]
That's the answer: the relation that the question asks for is that the $\xi$'s can be
calculated from the $\eta$'s by an application of the matrix $(\alpha_{ij})$. Equivalently: a
change of basis is effected by the matrix that changes one coordinate system
to another.


Solution 86. 


The effective tool that solves the problem is the same linear transformation T
that played an important role in Solutions 84 and 85, the one that sends the
x's to the y's. If, that is,
\[ Tx_j = y_j \quad (j = 1, \ldots, n), \]
then
\[ Cy_j = CTx_j \]
and
\[ Cy_j = \sum_i \alpha_{ij}y_i = \sum_i \alpha_{ij}Tx_i = T\Bigl(\sum_i \alpha_{ij}x_i\Bigr) = TBx_j. \]
Consequence:
\[ CTx_j = TBx_j \]
for all j, so that
\[ CT = TB. \]
That's an acceptable answer to the question, but the usual formulation
of the answer is slightly different. Solution 84 ended by a teasing puzzle
that asked whether T is invertible. The answer is obviously yes: T sends
a basis (namely the x's) onto a basis (namely the y's), and that guarantees
invertibility. In view of that fact, the relation between C and B can be written
in the form
\[ C = TBT^{-1}, \]
and that is the usual equation that describes the similarity of B and C.

The last phrase requires a bit more explanation. What it is intended to 
convey is that B and C are similar if and only if there exists an invertible 
transformation T such that $C = TBT^{-1}$. The argument so far has proved
only the “only if”. The other direction, the statement that if an invertible T of
the sort described exists, then B and C are indeed similar, is not immediately
obvious but it is pretty easy. It is to be proved that if T exists, then B and
C do indeed correspond to the same matrix via two different bases. All right;
assume the existence of a T, write B as a matrix in terms of an arbitrary
basis $\{x_1, \ldots, x_n\}$, so that

\[ Bx_j = \sum_i \alpha_{ij} x_i, \]

define a bunch of vectors y by writing $Tx_j = y_j$ ($j = 1, \ldots, n$), and then
compute as follows:

\[ Cy_j = CTx_j = TBx_j = \sum_i \alpha_{ij} Tx_i = \sum_i \alpha_{ij} y_i. \]


Conclusion: the matrix of B with respect to the x’s is the same as the matrix 
of C with respect to the y’s. 
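
The “if” direction can be checked on a small case. The following Python sketch is only an illustration: the matrices B and T are arbitrary choices made here (they do not come from the text), and the helper names `matmul`, `inverse2`, and `apply_to` are hypothetical. It forms $C = TBT^{-1}$ and confirms that $Cy_j = \sum_i \alpha_{ij} y_i$, where the $\alpha$’s are the entries of B and $y_j = Te_j$ is column j of T.

```python
from fractions import Fraction


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def inverse2(M):
    """Inverse of a 2 x 2 matrix whose determinant is not zero."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


def apply_to(M, v):
    """Apply the matrix M to the coordinate vector v."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]


# Assumed example data: the x's are the natural basis of a 2-dimensional space.
B = [[1, 2], [3, 4]]            # matrix (alpha_ij) of B with respect to the x's
T = [[1, 1], [0, 1]]            # invertible; column j is y_j = T x_j
C = matmul(matmul(T, B), inverse2(T))

ys = [[row[j] for row in T] for j in range(2)]

# For each j, compare C y_j with sum_i alpha_ij y_i.
same_matrix = all(
    apply_to(C, ys[j]) == [sum(B[i][j] * ys[i][k] for i in range(2))
                           for k in range(2)]
    for j in range(2)
)
```

If `same_matrix` is true, the matrix of C with respect to the y’s coincides with the matrix of B with respect to the x’s, which is what the argument above asserts in general.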


Solution 87. 


Some notation needs to be set up. The assumption is that one linear transfor- 
mation is given, call it B, and two bases 


\[ \{x_1, \ldots, x_n\} \quad\text{and}\quad \{y_1, \ldots, y_n\}. \]

Each basis can be used to express B as a matrix,

\[ Bx_j = \sum_i \beta_{ij} x_i \quad\text{and}\quad By_j = \sum_i \gamma_{ij} y_i, \]

and the question is about the relation between the $\beta$’s and the $\gamma$’s.
The transformation T that has been helpful in the preceding three problems
($Tx_j = y_j$) can still be helpful, but this time (because temporarily
matrices are at the center of the stage, not linear transformations), it is
advisable to express its action in matrix language. The matrix of T with respect to
the x’s is defined by

\[ Tx_j = \sum_k \tau_{kj} x_k, \]

and now the time has come to compute with it. Here goes:

\[ By_j = BTx_j = B\sum_k \tau_{kj} x_k = \sum_k \tau_{kj} Bx_k = \sum_k \tau_{kj} \sum_i \beta_{ik} x_i = \sum_i \Bigl( \sum_k \beta_{ik} \tau_{kj} \Bigr) x_i \]

and

\[ By_j = \sum_k \gamma_{kj} y_k = \sum_k \gamma_{kj} Tx_k = \sum_k \gamma_{kj} \sum_i \tau_{ik} x_i = \sum_i \Bigl( \sum_k \tau_{ik} \gamma_{kj} \Bigr) x_i. \]

Consequence:

\[ \sum_k \tau_{ik} \gamma_{kj} = \sum_k \beta_{ik} \tau_{kj}. \]

In an abbreviated but self-explanatory form the last equation asserts a relation
between the matrices $\beta$ and $\gamma$, namely that

\[ \tau\gamma = \beta\tau. \]

The invertibility of $\tau$ permits this to be expressed in the form

\[ \gamma = \tau^{-1}\beta\tau, \]

and, once again, the word similarity can be used: the matrices $\beta$ and $\gamma$ are
similar.


Solution 88.


Yes, it helps to know that at least one of B and C is invertible; in that case 
the answer is yes. If, for instance, B is invertible, then 


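\[ BC = BC(BB^{-1}) = B(CB)B^{-1}; \]

the argument in case C is invertible is a trivial modification of this one.
If neither B nor C is invertible, the conclusion is false. Example: if

\[ B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad C = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \]

then

\[ BC = C \quad\text{and}\quad CB = 0. \]

The important part of this conclusion is that $BC \neq 0$ but $CB = 0$.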


Comment. There is an analytic kind of argument, usually frowned upon 
as being foreign in spirit to pure algebra, that can sometimes be used to 
pass from information about invertible transformations to information about 
arbitrary ones. An example would be this: if B is invertible, then BC and 
CB are similar; if B is not invertible then it is the limit (here is the analysis) 
of a sequence $\{B_n\}$ of invertible transformations. Since $B_nC$ is similar to
$CB_n$, it follows (?) by passage to the limit that BC is similar to CB.

The argument is phony of course; where does it break down? What is
true is that there exist invertible transformations $T_n$ such that

\[ (B_nC)T_n = T_n(CB_n), \]

and what the argument tacitly assumes is that the sequence of $T_n$’s (or possibly
a subsequence) converges to an invertible limit T. If that were true, then it
would indeed follow that $(BC)T = T(CB)$; hence, as the proof above
implies, that cannot always be true. Here is a concrete example for the B and
C mentioned above: if

\[ B_n = T_n = \begin{pmatrix} 1 & 0 \\ 0 & 1/n \end{pmatrix}, \]

then indeed

\[ (B_nC)T_n = T_n(CB_n) \]

for all n, and the sequence $\{T_n\}$ converges all right, but its limit refuses to
be invertible.
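
The breakdown can be watched numerically. The sketch below is a hedged illustration, not part of the original solution: it assumes the choice $B_n = T_n = \mathrm{diag}(1, 1/n)$ (one natural sequence of invertible matrices converging to B), and the helper names are hypothetical. It checks that BC = C and CB = 0, that the intertwining relation holds for every n tried, and that $\det T_n = 1/n$ shrinks toward 0.

```python
from fractions import Fraction


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


B = [[1, 0], [0, 0]]
C = [[0, 1], [0, 0]]
BC = matmul(B, C)   # equals C, which is not zero
CB = matmul(C, B)   # equals the zero matrix


def intertwines(n):
    """Check (B_n C) T_n == T_n (C B_n) for the assumed B_n = T_n = diag(1, 1/n)."""
    Bn = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1, n)]]
    Tn = Bn
    return matmul(matmul(Bn, C), Tn) == matmul(Tn, matmul(C, Bn))


checks = [intertwines(n) for n in range(1, 50)]

# det T_n = 1/n: each T_n is invertible, but the determinants tend to 0,
# so the limit diag(1, 0) is not invertible.
dets = [Fraction(1, n) for n in (1, 10, 100, 1000)]
```

Every $T_n$ does its job, yet the limit of the sequence is B itself, which is exactly the non-invertible transformation the argument was trying to get around.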


Solution 89.


Yes, two real matrices that are complex similar are also real similar. Suppose, 
indeed, that A and B are real and that 


\[ SA = BS, \]

where S is an invertible complex matrix. Write S in terms of its real and
imaginary parts,

\[ S = P + iQ. \]


Since PA+iQA = BP +iBQ, and since A, B, P, and Q are all real, it 
follows that 


PA=BP and QA = BQ. 


The problem might already be solved at this stage; it is solved if either 
P or Q is invertible, because in that case the preceding equations imply that 
A and B are “real similar”. Even if P and Q are not invertible, however, the 
solution is not far away. 

Consider the polynomial 


\[ p(\lambda) = \det(P + \lambda Q). \]

Since $p(i) = \det(P + iQ) \neq 0$ (because S is invertible), the polynomial p is
not identically 0. It follows that the equation $p(\lambda) = 0$ can have only a finite
number of roots and hence that there exists a real number $\lambda$ such that the real
matrix $P + \lambda Q$ is invertible. That does it: since

\[ (P + \lambda Q)A = PA + \lambda QA = BP + \lambda BQ = B(P + \lambda Q), \]


the matrices A and B are similar over the field of real numbers. 
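
Here is a small worked instance of the trick, offered as a sketch under stated assumptions. The matrices A, B, P, and Q below are an example constructed for this illustration (they correspond to the complex matrix S = P + iQ with rows (1, i) and (0, i)); neither P nor Q is invertible, so the easy case does not apply, and the helper names are hypothetical.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


# Assumed example: real A and B with SA = BS for the complex S = P + iQ.
A = [[1, 0], [0, 2]]
B = [[1, 1], [0, 2]]
P = [[1, 0], [0, 0]]   # real part of S, singular
Q = [[0, 1], [0, 1]]   # imaginary part of S, singular


def combo(lam):
    """The real matrix P + lam * Q."""
    return [[P[i][j] + lam * Q[i][j] for j in range(2)] for i in range(2)]


# p(lam) = det(P + lam Q) is a non-zero polynomial of degree at most 2,
# so among any three distinct real values at least one is not a root.
lam = next(t for t in (0, 1, 2) if det2(combo(t)) != 0)
R = combo(lam)

real_parts_ok = matmul(P, A) == matmul(B, P) and matmul(Q, A) == matmul(B, Q)
real_similarity = matmul(R, A) == matmul(B, R) and det2(R) != 0
```

In this example the search stops at $\lambda = 1$, and the real invertible matrix R = P + Q satisfies RA = BR.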

The computation in this elementary proof is surely mild, but it’s there just 
the same. An alternative proof involves no computation at all, but it is much 
less elementary; it depends on the non-elementary concept of “elementary di- 
visors”. They are polynomials associated with a matrix; their exact definition 
is not important at the moment. What is important is that their coefficients 
are in whatever field the entries of the matrix belong to, and that two matrices 
are similar if and only if they have the same elementary divisors. Once these 
two statements are granted, the proof is finished: if A and B are real matrices 
that are similar (over whatever field happens to be under consideration, pro- 
vided only that it contains the entries of A and B), then they have the same 
elementary divisors, and therefore they must be similar over every possible 
field that contains their entries—in particular over the field of real numbers. 




Solution 90. 
Since 
\[ \ker A' = (\operatorname{ran} A)^0 \]

(by Problem 80) and since

\[ \dim (\operatorname{ran} A)^0 = n - \dim \operatorname{ran} A \]

(by Problem 65), it follows immediately that

\[ \operatorname{null} A' = n - \operatorname{rank} A. \]

Suppose now that $\{x_1, \ldots, x_m\}$ is a basis for ker A and extend it to a
basis $\{x_1, \ldots, x_m, x_{m+1}, \ldots, x_n\}$ of the entire space V. If

\[ x = \alpha_1 x_1 + \cdots + \alpha_m x_m + \alpha_{m+1} x_{m+1} + \cdots + \alpha_n x_n \]

is an arbitrary vector in V, then

\[ Ax = \alpha_{m+1} Ax_{m+1} + \cdots + \alpha_n Ax_n, \]

which implies that ran A is spanned by the set $\{Ax_{m+1}, \ldots, Ax_n\}$.
Consequence:

\[ \dim \operatorname{ran} A \le n - m, \]

or, in other words,

\[ \operatorname{rank} A \le n - \operatorname{null} A. \]

Apply the latter result to A’, and make use of the equation above connecting
null A’ and rank A to get

\[ \operatorname{rank} A' \le \operatorname{rank} A. \]

That almost settles everything. Indeed: apply it to A’ in place of A to get

\[ \operatorname{rank} A'' \le \operatorname{rank} A'; \]

in view of the customary identification of A’’ and A, the last two inequalities
together imply that

\[ \operatorname{rank} A = \operatorname{rank} A'. \]

Consequence:

\[ \operatorname{null} A' = n - \operatorname{rank} A', \]

and that equation, with A’ in place of A (and the same identification argument
as just above) yields

\[ \operatorname{rank} A + \operatorname{null} A = n. \]


The answer to the first question of the problem as stated is that if 
rank A = r, then there is only one possible value of rank A’, namely the 
same r. The answer to the second question is that there is only one possible 
value of null A, namely n — r. 


Comment. A special case of the principal result above (rank + nullity = 
dimension) is obvious: if rank A = 0 (which means exactly that ran A = O), 
then A must be the transformation 0, and therefore null A must be n. A 
different special case, not quite that trivial, has already appeared in this book, 
in Problem 65. The theorem there says, in effect, that the nullity of a linear 
transformation on a space of dimension n is 0 if and only if its rank is n—and 
that says exactly that the sum formula is true in case null A = 0. 


Solution 91. 


Yes, similar transformations have the same rank. Suppose, indeed, that B and 
C are linear transformations and T is an invertible linear transformation such 
that 


\[ CT = TB. \]

If y is a vector in ran B, so that y = Bx for some vector x, then the equation

\[ CTx = Ty \]

implies that Ty is in ran C, and hence that y belongs to $T^{-1}(\operatorname{ran} C)$. What
this argument proves is that

\[ \operatorname{ran} B \subset T^{-1}(\operatorname{ran} C). \]

Since the invertibility of T implies that $T^{-1}(\operatorname{ran} C)$ has the same dimension
as ran C, it follows that

\[ \operatorname{rank} B \le \operatorname{rank} C. \]

The proof can be completed by a lighthearted call on symmetry. The
assumption that B and C are similar is symmetric in B and C; if that assumption
implies that $\operatorname{rank} B \le \operatorname{rank} C$, then it must also imply that

\[ \operatorname{rank} C \le \operatorname{rank} B, \]

and from the two inequalities together it follows that

\[ \operatorname{rank} B = \operatorname{rank} C. \]


Solution 92. 


The answer is yes, every 2 x 2 matrix is similar to its transpose, and a 
surprisingly simple computation provides most of the proof: 


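\[ \begin{pmatrix} \gamma & 0 \\ 0 & \beta \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} \gamma\alpha & \gamma\beta \\ \beta\gamma & \beta\delta \end{pmatrix} = \begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} \begin{pmatrix} \gamma & 0 \\ 0 & \beta \end{pmatrix}. \]

If neither $\beta$ nor $\gamma$ is 0, then $\begin{pmatrix} \gamma & 0 \\ 0 & \beta \end{pmatrix}$ is invertible and that’s that:

\[ \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \]

is indeed similar to

\[ \begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix}. \]

If $\beta = 0$ and $\gamma \neq 0$, the proof is still easy, but it is not quite so near the
surface. If worse comes to worst, computation is bound to reveal it: just set

\[ \begin{pmatrix} \xi & \eta \\ \zeta & \theta \end{pmatrix} \begin{pmatrix} \alpha & 0 \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} \alpha & \gamma \\ 0 & \delta \end{pmatrix} \begin{pmatrix} \xi & \eta \\ \zeta & \theta \end{pmatrix} \]

and solve the implied system of four equations in four unknowns. It is
of course not enough just to find numbers $\xi$, $\eta$, $\zeta$, and $\theta$ that satisfy the
equations (for instance $\xi = \eta = \zeta = \theta = 0$ always works); it is necessary
also to find them so that the matrix is invertible. One possible solution is
indicated by the equation

\[ \begin{pmatrix} 0 & \gamma \\ \gamma & \delta - \alpha \end{pmatrix} \begin{pmatrix} \alpha & 0 \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} \alpha & \gamma \\ 0 & \delta \end{pmatrix} \begin{pmatrix} 0 & \gamma \\ \gamma & \delta - \alpha \end{pmatrix}. \]

That works (meaning that $\begin{pmatrix} 0 & \gamma \\ \gamma & \delta - \alpha \end{pmatrix}$ is invertible) because $\gamma \neq 0$.

The case in which $\gamma = 0$ and $\beta \neq 0$, that is, the problem of the similarity
of

\[ \begin{pmatrix} \alpha & \beta \\ 0 & \delta \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} \alpha & 0 \\ \beta & \delta \end{pmatrix}, \]

is the same as the one just discussed: just replace $\beta$ by $\gamma$ and interchange the
order in which the matrices were written.

(Does the assertion that similarity is a symmetric relation deserve explicit
mention? If B and C are similar, via T, that is if

\[ CT = TB, \]

then

\[ T^{-1}CTT^{-1} = T^{-1}TBT^{-1}. \]

Replace $TT^{-1}$ and $T^{-1}T$ by I and interchange the two sides of the equation
to get

\[ BT^{-1} = T^{-1}C, \]

and that is exactly the similarity of C and B via $T^{-1}$.)

If both $\beta$ and $\gamma$ are 0, the matrix degenerates to a diagonal one,

\[ \begin{pmatrix} \alpha & 0 \\ 0 & \delta \end{pmatrix}, \]

and the question of similarity to its transpose degenerates to a triviality: it is
equal to its transpose.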


Comment. That settles 2 x 2 matrices; what happens with matrices of size 3 
and greater? The answer is that the same result is true for every size—every 
matrix is similar to its transpose—but even for 3 x 3 matrices the problem 
of generalizing the computations of the 2 x 2 case becomes formidable. New 
ideas are needed, more sophisticated methods are needed. They exist, but they 
will come only later. 


Solution 93. 


The answer is 
\[ \operatorname{rank}(A + B) \le \operatorname{rank} A + \operatorname{rank} B. \]

For the proof, observe first that

\[ \operatorname{ran} A + \operatorname{ran} B \]

(in the sense of sums of subspaces, defined in Problem 28) consists of all
vectors of the form

\[ Ax + By, \]

and that

\[ \operatorname{ran}(A + B) \]

consists of all vectors of the form

\[ Az + Bz. \]

Consequence:

\[ \operatorname{ran}(A + B) \subset \operatorname{ran} A + \operatorname{ran} B, \]

and, as a consequence of that,

\[ \operatorname{rank}(A + B) \le \dim(\operatorname{ran} A + \operatorname{ran} B). \]

How is the right side of this inequality related to

\[ \dim \operatorname{ran} A + \dim \operatorname{ran} B? \]

In general: if M and N are subspaces, what is the relation between

\[ \dim(M + N) \quad\text{and}\quad \dim M + \dim N? \]

The answer is a natural one to guess and an easy one to prove, as follows.

If $\{x_1, \ldots, x_m\}$ is a basis for M and $\{y_1, \ldots, y_n\}$ is a basis for N, then the
set

\[ \{x_1, \ldots, x_m, y_1, \ldots, y_n\} \]

is surely big enough to span M + N. Consequence: the dimension of M + N
is not more than m + n, or, in other words,

\[ \dim(M + N) \le \dim M + \dim N. \]


The proof of the rank sum inequality is complete. 


Solution 94. 
Since $(AB)x = A(Bx)$, it follows that

\[ \operatorname{ran} AB \subset \operatorname{ran} A, \]

and hence that

\[ \operatorname{rank} AB \le \operatorname{rank} A. \]

Words are more useful here than formulas: what was just proved is that the
rank of a product is less than or equal to the rank of the left-hand factor. That
formulation implies that

\[ \operatorname{rank}(B'A') \le \operatorname{rank} B'. \]

Since, however,

\[ \operatorname{rank}(B'A') = \operatorname{rank}((AB)') = \operatorname{rank} AB, \]

and

\[ \operatorname{rank} B' = \operatorname{rank} B \]

(Problem 90), it follows that rank AB is less than or equal to both rank A
and rank B, and hence that

\[ \operatorname{rank} AB \le \min\{\operatorname{rank} A, \operatorname{rank} B\}. \]

That’s it; that’s the good relation between the rank of a product and the ranks
of its factors.


Comment. If B happens to be invertible, so that rank B is equal to the
dimension of the space, then the result just proved implies that

\[ \operatorname{rank} AB \le \operatorname{rank} A \]

and at the same time that

\[ \operatorname{rank} A = \operatorname{rank}\bigl((AB)B^{-1}\bigr) \le \operatorname{rank} AB, \]

so that, in fact,

\[ \operatorname{rank} AB = \operatorname{rank} A. \]

It follows that

\[ \operatorname{rank}(BA) = \operatorname{rank}(BA)' = \operatorname{rank}(A'B') = \operatorname{rank} A' = \operatorname{rank} A. \]


In sum: the product of a given transformation with an invertible one (in either 
order) always has the same rank as the given one. 


Solution 95. 


The range of a transformation A is the image under A of the entire space V, 
and its dimension is an old friend by now—that’s just the rank. What can 
be said about the dimension of the image of a proper subspace of V? The 
question is pertinent because 


ran(AB) = (AB)V = A(BV) = A(ran B), 
so that 
rank(AB) = dim(A(ran B)). 


If M is a subspace of dimension m, say, and if N is any complement of 
M, so that 


\[ V = M + N, \]

then

\[ \operatorname{ran} A = AV = AM + AN. \]

It follows that

\[ \operatorname{rank} A \le \dim(AM) + \dim(AN) \le \dim(AM) + \dim N \]

(because the application of a linear transformation can never increase
dimension), and hence that

\[ n - \operatorname{null} A \le \dim(AM) + n - m \]

(where n = dim V, of course). If in particular

\[ M = \operatorname{ran} B, \]

then the last inequality implies that

\[ \operatorname{rank} B - \operatorname{null} A \le \operatorname{rank}(AB), \]

or, equivalently, that

\[ n - \operatorname{null} A - \operatorname{null} B \le n - \operatorname{null}(AB). \]

Conclusion:

\[ \operatorname{null}(AB) \le \operatorname{null} A + \operatorname{null} B. \]

Together the two inequalities about products, namely the one just proved about
nullity and the one (Problem 94) about rank,

\[ \operatorname{rank} AB \le \min\{\operatorname{rank} A, \operatorname{rank} B\}, \]


are known as Sylvester’s law of nullity. 
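
The rank and nullity inequalities of the last few solutions can be spot-checked on concrete matrices. The following Python sketch is an illustration only: the matrices A and B are arbitrary examples chosen here, the function names are hypothetical, and the rank is computed by Gaussian elimination over the rationals (a standard technique swapped in for the basis arguments of the text).

```python
from fractions import Fraction


def rank(M):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(M):
    return [list(col) for col in zip(*M)]


def nullity(M):
    """Nullity of a square matrix via rank + nullity = n."""
    return len(M) - rank(M)


# Assumed example matrices on a 3-dimensional space.
A = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]
B = [[1, 0, 0], [0, 0, 0], [0, 0, 1]]
AB = matmul(A, B)
A_plus_B = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

transpose_rank_equal = rank(transpose(A)) == rank(A)
sum_inequality = rank(A_plus_B) <= rank(A) + rank(B)
product_inequality = rank(AB) <= min(rank(A), rank(B))
nullity_inequality = nullity(AB) <= nullity(A) + nullity(B)
```

A check of this kind proves nothing, of course, but it is a quick way to catch a misremembered inequality (for instance, one written with the wrong direction).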


Solution 96. 


(a) The “natural” basis vectors 
\[ e_1 = (1, 0, 0), \quad e_2 = (0, 1, 0), \quad e_3 = (0, 0, 1) \]

have a curious and special relation to the transformation C: it happens that

\[ Ce_1 = e_1, \quad Ce_2 = 2e_2, \quad Ce_3 = 3e_3. \]

If B and C were similar,

\[ CT = TB, \]

or, equivalently,

\[ BT^{-1} = T^{-1}C, \]

then the vectors

\[ f_j = T^{-1}e_j \quad (j = 1, 2, 3) \]

would have the same relation to B:

\[ BT^{-1}e_j = T^{-1}Ce_j = T^{-1}(je_j) = jT^{-1}e_j \]

for j = 1, 2, 3. Is that possible? Are there any vectors that are so related to
B?

There is no difficulty about j = 1: the vector $f_1$ can be chosen to be the
same as $e_1$. What about $f_2$? Well, that’s not hard either. Since

\[ Be_1 = e_1, \quad Be_2 = e_1 + 2e_2, \]

it follows that

\[ B(e_1 + e_2) = e_1 + (e_1 + 2e_2) = 2(e_1 + e_2); \]

in other words if $f_2 = e_1 + e_2$, then

\[ Bf_2 = 2f_2. \]

Is this the beginning of a machine? Yes, it is. Since

\[ Be_3 = e_1 + e_2 + 3e_3, \]

it follows that

\[ B(e_1 + e_2 + e_3) = B(e_1 + e_2) + Be_3 = 2(e_1 + e_2) + (e_1 + e_2 + 3e_3) = 3(e_1 + e_2 + e_3); \]

to get

\[ Bf_3 = 3f_3, \]

just set

\[ f_3 = e_1 + e_2 + e_3. \]

What good does all that do? Answer: it proves that B and C are similar.
Indeed: the vectors $f_1$, $f_2$, $f_3$, expressed in coordinate forms in terms of the
e’s as

\[ f_1 = (1, 0, 0), \quad f_2 = (1, 1, 0), \quad f_3 = (1, 1, 1), \]

constitute a basis. The matrix of B with respect to that basis is the matrix
C. Isn’t that clear from the definition of the matrix of B with respect to the
basis $\{f_1, f_2, f_3\}$? What that phrase means is this: form $Bf_j$ (for each j),
and express it as a linear combination of the f’s; the resulting coefficients
are the entries of column number j. Conclusion: B and C are similar.
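
The three relations $Bf_j = jf_j$ are easy to confirm mechanically. In the sketch below the matrix B is written down from the relations $Be_1 = e_1$, $Be_2 = e_1 + 2e_2$, $Be_3 = e_1 + e_2 + 3e_3$ used above (each image is a column); the function name `apply_to` is a hypothetical helper.

```python
def apply_to(M, v):
    """Apply the matrix M (list of rows) to the coordinate vector v."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]


# Column j of B holds the coordinates of B e_j.
B = [[1, 1, 1],
     [0, 2, 1],
     [0, 0, 3]]

fs = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]

# B f_j should equal j * f_j for j = 1, 2, 3.
relations = [apply_to(B, f) == [j * c for c in f]
             for j, f in enumerate(fs, start=1)]
```

Since all three relations hold, the matrix of B with respect to the f’s is the diagonal matrix with entries 1, 2, 3.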

(b) The reasoning here is similar to the one used above. The linear
transformation C has the property that

\[ Ce_1 = 0, \quad Ce_2 = e_1, \quad Ce_3 = e_2. \]

If B and C are similar,

\[ BT^{-1} = T^{-1}C, \]

then the vectors

\[ f_j = T^{-1}e_j \quad (j = 1, 2, 3) \]

are such that $Bf_1 = 0$, and, for j > 1,

\[ Bf_j = BT^{-1}e_j = T^{-1}Ce_j = T^{-1}e_{j-1} = f_{j-1}. \]

At this moment it may not be known that B and C are similar, but it makes
sense to ask whether there exist vectors $f_1$, $f_2$, $f_3$ that B treats in the way
just described.

Yes, such vectors exist, and the proof is not difficult. Just set $f_1$ equal to
$e_1$, set $f_2$ equal to $e_2$, and then start looking for $f_3$. Since $Be_3 = e_1 + e_2$, it
follows that

\[ B(e_3 - e_2) = (e_1 + e_2) - e_1 = e_2; \]

in other words, if $f_3 = e_3 - e_2$, then

\[ Bf_3 = f_2. \]

Once that’s done, the problem is solved. The vectors $f_1$, $f_2$, $f_3$, expressed in
coordinate forms in terms of the e’s as

\[ f_1 = (1, 0, 0), \quad f_2 = (0, 1, 0), \quad f_3 = (0, -1, 1), \]


constitute a basis, and the matrix of B with respect to that basis is equal 
to C. 

(c) The most plausible answer to both (a) and (b) is no—how could a 
similarity kill all the entries above the diagonal? Once, however, the answers 
have been shown to be yes, most people approaching (c) would probably be 
ready to guess yes—but this time the answer is no. 

What is obvious is that 


\[ Ce_1 = 2e_1, \quad Ce_2 = 3e_2, \quad Ce_3 = 3e_3. \]

What is one millimeter less obvious is that every linear combination of $e_2$
and $e_3$ is mapped onto 3 times itself by C. What must therefore be asked (in
view of the technique established in (a)) is whether or not there exist vectors
$f_1$, $f_2$, $f_3$ that B treats the way C treats the e’s. The answer turns out to be
no.

Suppose indeed that Bf = 3f, where f is a vector whose coordinate
form in terms of the e’s is, say, $(\alpha_1, \alpha_2, \alpha_3)$. Since

\[ Bf = (2\alpha_1 + \alpha_2 + \alpha_3, \; 3\alpha_2 + \alpha_3, \; 3\alpha_3), \]

the only way that can be equal to

\[ 3f = (3\alpha_1, 3\alpha_2, 3\alpha_3) \]

is to have $\alpha_3 = 0$ (look at the second coordinates). From that in turn it follows
that $\alpha_2 = \alpha_1$ (look at the first coordinates). To sum up: f must look like
$(\tau, \tau, 0)$, or simpler said, every solution of the vector equation Bf = 3f is of
the form $(\tau, \tau, 0)$. Consequence: the set of solutions of that vector equation
is a subspace of dimension 1, not 2. For C the corresponding dimension was 
2, and that distinction settles the argument—B and C cannot be similar. 

(d) In view of all this, what would a reasonable person guess about (d) 
by now? Is it imaginable that a similarity can double a linear transformation? 

Yes, it is. The action of B on the natural basis {e1, e2,e3} can be de- 
scribed this way: the first basis vector is killed, and the other two are shifted 
backward to their predecessors. The question is this: is there a basis such that 
the first of its vectors is killed and each of the others is shifted backward to 
twice its predecessor? In that form the answer is easy to see: put 


\[ f_1 = (1, 0, 0), \quad f_2 = (0, 2, 0), \quad f_3 = (0, 0, 4). \]

That solves the problem, and nothing more needs to be said, but it might
be illuminating to see a linear transformation that sends the e’s to the f’s
and, therefore, actually transforms B into C. That’s not hard: if

\[ T = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \tfrac12 & 0 \\ 0 & 0 & \tfrac14 \end{pmatrix}, \]

then

\[ T^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 4 \end{pmatrix} \]

(so that $T^{-1}$ sends each $e_j$ to $f_j$), and painless matrix multiplication proves
that $TBT^{-1} = 2B$.

(e) The matrix of B with respect to the natural basis {e1, e2, e3} is the 
one exhibited in the question; what is the matrix of B with respect to the 
basis given by 


\[ f_1 = e_3, \quad f_2 = e_2, \quad f_3 = e_1? \]

The answer is as easy as any matrix determination can ever be. Since

\[ Bf_1 = Be_3 = e_1 + e_2 + e_3 = f_3 + f_2 + f_1, \]
\[ Bf_2 = Be_2 = e_1 + e_2 = f_3 + f_2, \]
\[ Bf_3 = Be_1 = e_1 = f_3, \]


it follows that the matrix of B with respect to the f’s is exactly C. 
Note that C is the transpose of B; compare with Problem 92. 


Solution 97. 


Define a linear transformation P by 
\[ P\hat{x}_j = \hat{y}_j \quad (j = 1, \ldots, n) \]

and compute:

\[ Cy_j = \sum_i \alpha_{ij} \hat{y}_i = \sum_i \alpha_{ij} P\hat{x}_i = P \sum_i \alpha_{ij} \hat{x}_i = PBx_j. \]

The result almost forces the next step: to make the two extreme terms of
this chain of equalities comparable, it is desirable to introduce the linear
transformation Q for which

\[ x_j = Qy_j \quad (j = 1, \ldots, n). \]

The result is that $Cy_j = PBQy_j$ for all j, and hence that

\[ C = PBQ. \]

That’s the answer: B and C are equivalent if and only if there exist
invertible transformations P and Q such that C = PBQ. The argument just
given proves “only if”, but, just as for Problem 86, the “if” must be proved
too. The proof is a routine imitation of the proof in Problem 86, except it is
not quite obvious how to set up the notation: what can be chosen arbitrarily
to begin with and what should be defined in terms of it? Here is one way to
answer those questions.

Assume that C = PBQ, and choose, arbitrarily, two bases

\[ \{x_1, \ldots, x_n\} \quad\text{and}\quad \{\hat{x}_1, \ldots, \hat{x}_n\}. \]

Write

\[ x_j = Qy_j \quad\text{and}\quad \hat{y}_j = P\hat{x}_j \]

and write B as a matrix with respect to the x’s and $\hat{x}$’s:

\[ Bx_j = \sum_i \alpha_{ij} \hat{x}_i. \]

It follows that

\[ Cy_j = PBQy_j = PBx_j = P \sum_i \alpha_{ij} \hat{x}_i = \sum_i \alpha_{ij} P\hat{x}_i = \sum_i \alpha_{ij} \hat{y}_i. \]

Comparison of the last two displayed equations shows that the matrix $C(Y, \hat{Y})$
of C with respect to the y’s and $\hat{y}$’s is the same as the matrix $B(X, \hat{X})$ of B with
respect to the x’s and $\hat{x}$’s.


Question. If B is equivalent to C, does it follow that $B^2$ is equivalent to
$C^2$? The first attempt at answering the question, without using the following
problem, is not certain to be successful. 


Solution 98. 


Suppose that A is a linear transformation of rank r, say, on a finite-dimen- 
sional vector space V. Since the kernel of A is a subspace of dimension n—r, 
standard techniques of extending bases show that there exists a basis 


\[ x_1, \ldots, x_r, x_{r+1}, \ldots, x_n \]

of V such that $\{x_{r+1}, \ldots, x_n\}$ is a basis for ker A. Assertion: the vectors

\[ y_1 = Ax_1, \ldots, y_r = Ax_r \]

are linearly independent. Indeed: the only way it can happen that

\[ \gamma_1 Ax_1 + \cdots + \gamma_r Ax_r = 0 \]

is to have $\gamma_1 x_1 + \cdots + \gamma_r x_r$ in the kernel of A. Reason: since the x’s form
a basis for V, the only way a linear combination of the first r of them can
be equal to a linear combination of the last n − r is to have all coefficients
equal to 0.

Once that is known, then, of course, the set $\{y_1, \ldots, y_r\}$ can be extended
to a basis

\[ y_1, \ldots, y_r, y_{r+1}, \ldots, y_n \]

of V. What is the matrix of A with respect to the pair of bases (the x’s and
the y’s) under consideration here? Answer: it is the n × n diagonal matrix
the first r of whose diagonal terms are 1’s and the last n − r are 0’s.

That remarkable conclusion should come as a surprise. It implies that
every matrix of rank r is equivalent to a projection of rank r, and hence that
any two matrices of rank r are equivalent to one another.


Chapter 7. Canonical Forms 


Solution 99. 


If E is a projection, and if $\lambda$ is an eigenvalue of E with eigenvector x, so
that $Ex = \lambda x$, then

\[ Ex = E^2x = E(Ex) = E(\lambda x) = \lambda Ex = \lambda(\lambda x) = \lambda^2 x. \]

Since $\lambda x = \lambda^2 x$ and $x \neq 0$ (by the definition of eigenvector), it follows
that $\lambda = \lambda^2$. Consequence: the only possible eigenvalues of E are 0 and 1.
Since the roots of the characteristic equation are exactly the eigenvalues, it
follows that the only possible factors of the characteristic polynomial can be
$\lambda$ and $1 - \lambda$, and hence that the characteristic polynomial must be of the form
$\lambda^k (1 - \lambda)^{n-k}$, with k = 0, 1, ..., n.


Question. If rank E = 0 (that is, if E = 0), then k = n; if rank E = n
(that is, if E = 1), then k = 0; what is k for other values of rank E?


Solution 100. 
The sum of the roots of a (monic) polynomial equation 
\[ \lambda^n + a_1 \lambda^{n-1} + \cdots + a_{n-1} \lambda + a_n = 0 \]

is equal to $-a_1$ and the product of the roots is equal to plus or minus $a_n$
(depending on whether n is even or odd). To become convinced of these
statements, just write the polynomial in factored form

\[ (\lambda - \lambda_1) \cdots (\lambda - \lambda_n). \]

It follows that the sum and the product of the eigenvalues of a matrix A
belong to the field in which the entries of A lie.

The product of the eigenvalues is equal to the determinant (think about
triangularization), and that observation yields an alternative proof of the
assertion about the product. The sum of the eigenvalues is also easy to read off
the matrix A: it is the coefficient of $(-\lambda)^{n-1}$ in the expansion of $\det(A - \lambda)$,
and hence it is the sum of the diagonal entries. That sum has a name: it is
called the trace of the matrix A.

The answer to the question as it was asked is a strong NO: even though 
the eigenvalues of a rational matrix can be irrational, their sum and their 
product must be rational. 


Solution 101. 


The answer is yes: AB and BA always have the same eigenvalues. 

It is to be proved that if $\lambda \neq 0$ and $AB - \lambda$ is not invertible, then neither
is $BA - \lambda$, or, contrapositively, that if $AB - \lambda$ is invertible, then so is $BA - \lambda$.
Change signs (that is surely harmless), divide by $\lambda$, and then replace A, say,
by $A/\lambda$. These manipulations reduce the problem to proving that if 1 − AB is
invertible, then so is 1 − BA.

At this point it turns out to be clever to do something silly. Pretend that
the classical infinite series formula for $\frac{1}{1-x}$ is applicable,

\[ \frac{1}{1 - x} = 1 + x + x^2 + x^3 + \cdots, \]

and apply it, as a matter of purely formal juggling, to BA in place of x. The
result is

\[ (1 - BA)^{-1} = 1 + BA + BABA + BABABA + \cdots \]
\[ = 1 + B(1 + AB + ABAB + \cdots)A \]
\[ = 1 + B(1 - AB)^{-1}A. \]

Granted that this is all meaningless, it suggests that, maybe, if 1 − AB is
invertible, with, say

\[ (1 - AB)^{-1} = X, \]

then 1 − BA is also invertible, with

\[ (1 - BA)^{-1} = 1 + BXA. \]

Now that statement may be true or false, but it is in any case meaningful: it
can be checked. Assume, that is, that (1 − AB)X = 1 (which can be written
as ABX = X − 1) and calculate:

\[ (1 - BA)(1 + BXA) = (1 + BXA) - BA(1 + BXA) \]
\[ = 1 + BXA - BA - BABXA \]
\[ = 1 + BXA - BA - B(X - 1)A \]
\[ = 1 + BXA - BA - BXA + BA = 1. \]


Victory ! 
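
The formula suggested by the “silly” series can be tried on actual matrices. The sketch below is illustrative only: A and B are arbitrary 2 × 2 examples chosen here so that 1 − AB is invertible, and the helper names are hypothetical. It computes $X = (1 - AB)^{-1}$ exactly and checks that $1 + BXA$ is inverse to $1 - BA$.

```python
from fractions import Fraction


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def subtract(X, Y):
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def inverse2(M):
    """Inverse of a 2 x 2 matrix whose determinant is not zero."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


I = [[1, 0], [0, 1]]
A = [[1, 2], [0, 1]]    # assumed example
B = [[0, 1], [1, 1]]    # assumed example

X = inverse2(subtract(I, matmul(A, B)))          # X = (1 - AB)^{-1}
candidate = add(I, matmul(matmul(B, X), A))      # 1 + BXA
one_minus_BA = subtract(I, matmul(B, A))

right_inverse = matmul(one_minus_BA, candidate) == I
left_inverse = matmul(candidate, one_minus_BA) == I
```

Both products come out as the identity, in agreement with the computation above.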


Solution 102. 


If $\lambda$ is an eigenvalue of A with eigenvector x,

\[ Ax = \lambda x, \]

then

\[ A^2x = A(Ax) = A(\lambda x) = \lambda(Ax) = \lambda(\lambda x) = \lambda^2 x, \]

and, by an obvious inductive repetition of this argument,

\[ A^n x = \lambda^n x \]

for every positive integer n. (For the integer 0 the equation is if possible
even truer.) This in effect answers the question about monomials. A linear
combination of a finite number of these true equations yields a true equation.
That statement is a statement about polynomials in general: it says that if p
is a polynomial, then $p(\lambda)$ is an eigenvalue of p(A).


Solution 103. 


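Since the matrix of A is

\[ \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \]

the characteristic equation is

\[ \lambda^3 = 1. \]

Since

\[ \lambda^3 - 1 = (\lambda - 1)(\lambda^2 + \lambda + 1), \]

it follows that the roots of the characteristic equation are the three cube roots
of unity:

\[ 1, \quad \omega = -\frac{1}{2} + \frac{i}{2}\sqrt{3}, \quad\text{and}\quad \omega^2 = -\frac{1}{2} - \frac{i}{2}\sqrt{3}. \]

The corresponding eigenvectors are easy to calculate (and even easier to guess
and to verify); they are

\[ u = (1, 1, 1), \quad v = (1, \omega, \omega^2), \quad\text{and}\quad w = (1, \omega^2, \omega). \]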
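
The “guess and verify” step can be delegated to a few lines of Python. This is a sketch under an explicit assumption: the transformation is taken to be the cyclic shift $A(x_1, x_2, x_3) = (x_2, x_3, x_1)$, the pattern suggested by the 2-dimensional analogue in the comment below; the function names are hypothetical, and floating-point arithmetic is used, so equality is tested only up to a small tolerance.

```python
import math

# A primitive cube root of unity.
omega = complex(-0.5, math.sqrt(3) / 2)


def A(v):
    """Assumed cyclic shift: (x1, x2, x3) -> (x2, x3, x1)."""
    x1, x2, x3 = v
    return (x2, x3, x1)


def is_eigenpair(lam, v, tol=1e-12):
    """True if A(v) agrees with lam * v coordinate by coordinate, up to tol."""
    return all(abs(a - lam * c) < tol for a, c in zip(A(v), v))


pairs = [
    (1, (1, 1, 1)),
    (omega, (1, omega, omega ** 2)),
    (omega ** 2, (1, omega ** 2, omega)),
]

verified = [is_eigenpair(lam, v) for lam, v in pairs]
cube_is_one = abs(omega ** 3 - 1) < 1e-12
```

With the opposite shift, $(x_1, x_2, x_3) \mapsto (x_3, x_1, x_2)$, the same three vectors would still be eigenvectors, but the roles of $\omega$ and $\omega^2$ would be interchanged.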


Comment. It is easy and worth while to generalize the question to dimensions
greater than 3 and to permutations more complicated than the simple cyclic
permutation that sends (1, 2, 3) to (2, 3, 1). The most primitive instance of
this kind occurs in dimension 2. The eigenvalues of the transformation A
defined on $\mathbb{C}^2$ by

\[ A(x_1, x_2) = (x_2, x_1) \]

are of course the eigenvalues

\[ 1 \quad\text{and}\quad -1 \]

of the matrix

\[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \]

with corresponding eigenvectors

\[ (1, 1) \quad\text{and}\quad (1, -1). \]

These two vectors constitute a basis; the matrix of A with respect to that basis
is

\[ \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]

The discussion of the 3 × 3 matrix A of the problem can also be regarded
as solving a diagonalization problem; its result is that that matrix is similar
to

\[ \begin{pmatrix} 1 & 0 & 0 \\ 0 & \omega & 0 \\ 0 & 0 & \omega^2 \end{pmatrix}. \]

The next higher dimension, n = 4, is of interest. There the matrix becomes

\[ \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{pmatrix}. \]

Its characteristic equation is

\[ \lambda^4 = 1, \]

and, therefore, its eigenvalues are the fourth roots of unity:

\[ 1, \quad i, \quad -1, \quad -i. \]

Consequence: the diagonalized version of the 4 × 4 matrix is

\[ \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & i & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -i \end{pmatrix}. \]


Solution 104.


Yes, every eigenvalue of p(A) is of the form $p(\lambda)$ for some eigenvalue $\lambda$ of
A. Indeed, if $\mu$ is an eigenvalue of p(A), consider the polynomial $p(\lambda) - \mu$.
By the fundamental theorem of algebra that polynomial can be factored into
linear pieces. If

\[ p(\lambda) - \mu = (\lambda - \lambda_1) \cdots (\lambda - \lambda_n), \]

then

\[ p(A) - \mu = (A - \lambda_1) \cdots (A - \lambda_n). \]

The assumption about $\mu$ implies that there exists a non-zero vector x such
that $p(A)x = \mu x$, and from that it follows that

\[ (A - \lambda_1) \cdots (A - \lambda_n)x = 0. \]

Consequence: the product of the linear transformations

\[ (A - \lambda_1), \ldots, (A - \lambda_n) \]

is not invertible, and from that it follows that $A - \lambda_j$ is not invertible for
at least one j. That means that $\lambda_j$ is an eigenvalue of A for at least one j.
Conclusion: since

\[ p(\lambda_j) - \mu = 0, \]

the eigenvalue $\mu$ does indeed have the form $p(\lambda)$ for some eigenvalue $\lambda$ of
A (namely $\lambda = \lambda_j$).


Comment. The set of all eigenvalues of a linear transformation A on a finite- 
dimensional vector space is called the spectrum of A and is often referred to 
by the abbreviation spec A. With that notation Solution 102 can be expressed 
by saying that if A is a linear transformation, then 


\[ p(\operatorname{spec} A) \subset \operatorname{spec}(p(A)), \]


and the present solution strengthens that to 


p(spec A) = spec(p(A)). 


Another comment deserves to be made, one about the factorization tech- 
nique used in the proof above. Is spec(A) always non-empty? That is: does 
every linear transformation on a finite-dimensional vector space have an eigen- 
value? The answer is yes, of course, and the shortest proof of the answer (the 
one used till now) uses determinants (the characteristic equation of A). The 
factorization technique provides an alternative proof, one without determi- 
nants, as follows. 

Given A, on a space of dimension n, take any non-zero vector x and 
form the vectors 


\[ x, Ax, A^2x, \ldots, A^nx. \]

Since there are n + 1 of them, they cannot be linearly independent. It follows
that there exist scalars $\alpha_0, \alpha_1, \alpha_2, \ldots, \alpha_n$, not all 0, such that

\[ \alpha_0 x + \alpha_1 Ax + \alpha_2 A^2x + \cdots + \alpha_n A^nx = 0. \]

It simplifies the notation and it loses no information to assume that if k
is the largest index for which $\alpha_k \neq 0$, then, in fact, $\alpha_k = 1$: just divide
through by $\alpha_k$. (Note that $k \geq 1$, because $x \neq 0$.) A different language for
saying what the preceding equation says (together with the normalization of
the preceding sentence) is to say that there exists a monic polynomial p of
degree less than or equal to n such that p(A)x = 0. Apply the fundamental
theorem of algebra to factor p:

\[ p(\lambda) = (\lambda - \lambda_1) \cdots (\lambda - \lambda_k). \]

Since p(A)x = 0 it is possible to reason as in the solution above to infer
that $A - \lambda_j$ is not invertible for at least one j, and there it is: $\lambda_j$ is an
eigenvalue of A.

The fundamental theorem of algebra is one of the deepest and most useful 
facts of mathematics—its repeated use in linear algebra should not come as a 
surprise. The need to use it is what makes it necessary to work with complex 
numbers instead of only real ones. 


Solution 105. 


The polynomials $1, x, x^2, x^3$ form a basis of $P_3$; the matrix of D with respect
to that basis is

\[ \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \]

Consequences: the only eigenvalue of D is 0, and the characteristic polynomial
of D is $\lambda^4$. The algebraic multiplicity of the eigenvalue 0 is the exponent 4.
What about the geometric multiplicity? The question is about the solutions p 
of the equation 


Dp = 0; 


in other words, the question is about the most trivial possible differential 
equation. Since the only functions (polynomials) whose derivative is 0 are the 
constants, the geometric multiplicity of 0 (the dimension of the eigenspace 
corresponding to 0) is 1. 
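
A quick machine check of these two multiplicities, offered as a sketch: the matrix is the one for $D$ with respect to the basis $1, x, x^2, x^3$, and the helper name `mat_mul` is introduced here only for illustration. Because the matrix is already in echelon form, its rank is simply the number of non-zero rows.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Matrix of differentiation on polynomials of degree at most 3.
D = [[0, 1, 0, 0],
     [0, 0, 2, 0],
     [0, 0, 0, 3],
     [0, 0, 0, 0]]
zero = [[0] * 4 for _ in range(4)]

D2 = mat_mul(D, D)
D3 = mat_mul(D2, D)
D4 = mat_mul(D3, D)

# D is in echelon form, so rank = number of non-zero rows.
rank = sum(1 for row in D if any(row))
nullity = 4 - rank      # dimension of the eigenspace for eigenvalue 0

print("D^3 == 0:", D3 == zero, " D^4 == 0:", D4 == zero)
print("geometric multiplicity of 0:", nullity)
```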


Solution 106. 


The answer is good: every transformation on an n-dimensional space with n 
distinct eigenvalues is diagonalizable. 

Suppose, to begin with, that $n = 2$. If $A$ is a linear transformation on a
2-dimensional space, with distinct eigenvalues $\lambda_1$ and $\lambda_2$ and corresponding
eigenvectors $x_1$ and $x_2$, then (surprise?) $x_1$ and $x_2$ are linearly independent.
Reason: if

$$\alpha_1 x_1 + \alpha_2 x_2 = 0,$$

apply $A - \lambda_1$ to that equation. Since $A - \lambda_1$ kills $x_1$, the result is

$$\alpha_2(\lambda_2 - \lambda_1)x_2 = 0,$$

and since $\lambda_2 - \lambda_1 \neq 0$ (assumption) and $x_2 \neq 0$ (eigenvector), it follows that
$\alpha_2 = 0$. That in turn implies that $\alpha_1 = 0$, or, alternatively, an application of
$A - \lambda_2$ to the assumed equation yields the same conclusion.

If $n = 3$, and if the three distinct eigenvalues in question are $\lambda_1$, $\lambda_2$, and
$\lambda_3$, with eigenvectors $x_1$, $x_2$, and $x_3$, the same conclusion holds: the $x$'s are
linearly independent. Reason: if

$$\alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 = 0,$$

apply $A - \lambda_1$ to infer

$$\alpha_2(\lambda_2 - \lambda_1)x_2 + \alpha_3(\lambda_3 - \lambda_1)x_3 = 0,$$

and then apply $A - \lambda_2$ to infer

$$\alpha_3(\lambda_3 - \lambda_1)(\lambda_3 - \lambda_2)x_3 = 0.$$

That implies $\alpha_3 = 0$ (because $(\lambda_3 - \lambda_1)(\lambda_3 - \lambda_2) \neq 0$ and $x_3 \neq 0$). Continue
the same way: apply first $A - \lambda_1$ and then $A - \lambda_3$ to get $\alpha_2 = 0$, and by
obvious small modifications of these steps get $\alpha_1 = 0$.

The general case, for an arbitrary n, should now be obvious, and from it 
diagonalization follows. Indeed, once it is known that a transformation on an 
n-dimensional space has n linearly independent eigenvectors, then its matrix 
with respect to the basis those vectors form is diagonal—and in this last step 
it no longer even matters that the eigenvalues are distinct. 


Comment. Here is a minute but enchanting corollary of the result: a $2 \times 2$
real matrix with negative determinant is diagonalizable. Reason: since the
characteristic polynomial is a quadratic real polynomial with leading
coefficient 1 and negative constant term, the quadratic formula implies that the two
eigenvalues are distinct.
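
The corollary is easy to watch in action. In the sketch below the function name `eigenvalues_2x2` and the sample matrix are assumptions made for illustration; the point is only that a negative determinant forces a positive discriminant, hence two distinct real roots of opposite sign.

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Real eigenvalues of [[a, b], [c, d]] by the quadratic formula.

    The characteristic polynomial is t^2 - (a + d) t + (ad - bc).
    A negative determinant makes the discriminant positive, so the
    square root below is of a positive number.
    """
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det
    root = math.sqrt(disc)
    return (trace - root) / 2, (trace + root) / 2

# Sample matrix with determinant 1*4 - 2*3 = -2 (illustrative choice).
low, high = eigenvalues_2x2(1, 2, 3, 4)
print(low, high)
```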


Solution 107. 


Suppose that $A$ is a linear transformation on a finite-dimensional vector space
and that $\lambda_0$ is one of its eigenvalues with eigenspace $M_0$. If $x$ belongs to $M_0$,
then

$$Ax = \lambda_0 x,$$

which says that $Ax$ belongs to $M_0$. In other words, the subspace $M_0$ is
invariant under $A$. If $A_0$ is the linear transformation $A$ considered on $M_0$
only (the restriction of $A$ to $M_0$), then the polynomial $\det(A_0 - \lambda)$ is a
factor of the polynomial $\det(A - \lambda)$. (Why?) If the dimension of $M_0$
(= the geometric multiplicity of $\lambda_0$) is $m_0$, then

$$\det(A_0 - \lambda) = (\lambda_0 - \lambda)^{m_0},$$

and it follows that $(\lambda_0 - \lambda)$ occurs as a factor of $\det(A - \lambda)$ with an
exponent $m$ greater than or equal to $m_0$. That's it: the assertion $m \geq m_0$ says
exactly that geometric multiplicity is always less than or equal to algebraic
multiplicity.


Comment. What can be said about a transformation for which the algebraic 
multiplicity of every eigenvalue is equal to 1? In view of the present result the 
answer is that the geometric multiplicity of every eigenvalue is equal to 1, and 
hence that the number of eigenvalues is equal to the dimension. Conclusion 
(see Problem 106): the matrix is diagonalizable. 


Solution 108. 
The calculation of the characteristic polynomials is easy enough:

$$\det(A - \lambda) = (1 - \lambda)(2 - \lambda)(6 - \lambda) + 3 + 6(1 - \lambda) + (6 - \lambda)$$

and

$$\det(B - \lambda) = (1 - \lambda)(5 - \lambda)(3 - \lambda) + 4(3 - \lambda).$$

These both work out to

$$-(\lambda^3 - 9\lambda^2 + 27\lambda - 27),$$

which is equal to

$$-(\lambda - 3)^3.$$

It follows that both $A$ and $B$ have only one eigenvalue, namely $\lambda = 3$, of
algebraic multiplicity 3, and, on the evidence so far available, it is possible
to guess that $A$ and $B$ are similar.

What are the eigenvectors of $A$? To have $Au = 3u$, where $u = (\alpha, \beta, \gamma)$,
means having

$$\begin{aligned} \alpha + \beta &= 3\alpha \\ -\alpha + 2\beta + \gamma &= 3\beta \\ 3\alpha - 6\beta + 6\gamma &= 3\gamma. \end{aligned}$$

These equations are easy to solve; it turns out that the only solutions are the
vector $u = (1, 2, 3)$ and its scalar multiples. Consequence: the eigenvalue 3
of $A$ has geometric multiplicity 1.

For $B$ the corresponding equations are

$$\begin{aligned} \alpha + \beta &= 3\alpha \\ -4\alpha + 5\beta &= 3\beta \\ -6\alpha + 3\beta + 3\gamma &= 3\gamma. \end{aligned}$$

The eigenspace of the eigenvalue 3 is 2-dimensional this time; it is the set of
all vectors of the form $(\alpha, 2\alpha, \gamma)$. Consequence: the eigenvalue 3 of $B$ has
geometric multiplicity 2. Partial conclusion: $A$ and $B$ are not similar.

The upper triangular form of both $A$ and $B$ must be something like

$$\begin{pmatrix} 3 & \alpha & \gamma \\ 0 & 3 & \beta \\ 0 & 0 & 3 \end{pmatrix}.$$


Even a little experience with similarity indicates that that form is not 
uniquely determined—the discussion of Problem 94 shows that similarity can 
effect radical changes in the stuff above the diagonal (see in particular part 
(b)). Here is a pertinent special example that fairly illustrates the general case: 


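$$\begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 3 & 1 & 1 \\ 0 & 3 & 1 \\ 0 & 0 & 3 \end{pmatrix} \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 3 & 1 & 0 \\ 0 & 3 & 1 \\ 0 & 0 & 3 \end{pmatrix}.$$

In view of these comments it is not unreasonable to restrict the search for
triangular forms to those whose top right corner entry is 0.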
For $A$, the search is for vectors $u$, $v$, and $w$ such that

$$Au = 3u, \quad Av = u + 3v, \quad \text{and} \quad Aw = v + 3w.$$

As for $u$, that's already at hand: that's the eigenvector $(1, 2, 3)$ found
above.

If $v = (\alpha, \beta, \gamma)$ (the notation of the calculation that led to $u$ is now
abandoned), then the equation for $v$ says that

$$\begin{aligned} \alpha + \beta &= 3\alpha + 1 \\ -\alpha + 2\beta + \gamma &= 3\beta + 2 \\ 3\alpha - 6\beta + 6\gamma &= 3\gamma + 3. \end{aligned}$$

These equations are just as easy to solve as the ones that led to $u$. Their
solutions are the vectors of the form $(\alpha, 2\alpha + 1, 3\alpha + 3)$, a family
depending on one parameter. One of them (one is enough) is $(0, 1, 3)$; call that one $v$.

If $w = (\alpha, \beta, \gamma)$ (another release of old notation), then the equation for $w$
becomes

$$\begin{aligned} \alpha + \beta &= 3\alpha \\ -\alpha + 2\beta + \gamma &= 3\beta + 1 \\ 3\alpha - 6\beta + 6\gamma &= 3\gamma + 3. \end{aligned}$$

The solutions of these equations are the vectors of the form $(\alpha, 2\alpha, 3\alpha + 1)$;
a typical one of which (with $\alpha = 1$) is $w = (1, 2, 4)$.

The vectors $u$, $v$, and $w$ so obtained constitute the basis; the matrix of $A$
with respect to that basis is

$$\begin{pmatrix} 3 & 1 & 0 \\ 0 & 3 & 1 \\ 0 & 0 & 3 \end{pmatrix},$$

as it should be.
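The procedure for $B$ is entirely similar. Begin with the eigenvector $u =
(1, 2, 3)$ with eigenvalue 3, and then look for a vector $v$ such that

$$Bv = u + 3v.$$

If $v = (\alpha, \beta, \gamma)$, this equation becomes

$$\begin{aligned} \alpha + \beta &= 3\alpha + 1 \\ -4\alpha + 5\beta &= 3\beta + 2 \\ -6\alpha + 3\beta + 3\gamma &= 3\gamma + 3, \end{aligned}$$

and the solutions of that are the vectors of the form $(\alpha, 2\alpha + 1, \gamma)$. If $v$ is the
one with $\alpha = 0$ and $\gamma = 3$, so that $v = (0, 1, 3)$, and if $w$ is an eigenvector
of $B$ that is not a multiple of $u$, for instance $w = (0, 0, 1)$, then the vectors
$u$, $v$, and $w$ constitute a basis, and the matrix of $B$ with respect to that basis is

$$\begin{pmatrix} 3 & 1 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{pmatrix}.$$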
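
Both triangular forms can be checked mechanically. The sketch below rests on assumptions that should be stated loudly: the matrices $A$ and $B$ are read off from the coefficient equations displayed in this solution (the statement of the problem is not reproduced here), the third basis vector used for $B$ is the eigenvector $(0, 0, 1)$, and the helper names are invented for illustration. With $P$ the matrix whose columns are $u$, $v$, $w$, the claim "the matrix of $A$ in that basis is $J$" is the same as $AP = PJ$ with $P$ invertible.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def from_columns(*vectors):
    """Matrix whose columns are the given vectors."""
    return [list(row) for row in zip(*vectors)]

def det3(M):
    """Determinant of a 3 x 3 matrix, used only to confirm a basis."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# Assumed matrices, reconstructed from the displayed equations.
A = [[1, 1, 0], [-1, 2, 1], [3, -6, 6]]
B = [[1, 1, 0], [-4, 5, 0], [-6, 3, 3]]

P_A = from_columns((1, 2, 3), (0, 1, 3), (1, 2, 4))   # u, v, w for A
J_A = [[3, 1, 0], [0, 3, 1], [0, 0, 3]]

P_B = from_columns((1, 2, 3), (0, 1, 3), (0, 0, 1))   # u, v, w for B
J_B = [[3, 1, 0], [0, 3, 0], [0, 0, 3]]

a_ok = mat_mul(A, P_A) == mat_mul(P_A, J_A) and det3(P_A) != 0
b_ok = mat_mul(B, P_B) == mat_mul(P_B, J_B) and det3(P_B) != 0
print("A:", a_ok, " B:", b_ok)
```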


Solution 109. 


If $n$ is 2, the answer is trivially yes. If the question concerned $\mathbb{C}^n$ instead
of $\mathbb{R}^n$ (with the understanding that in the complex case the dimension
being asked about is the complex dimension), the answer would be easily yes
again; just triangularize and look. One way of proving that the answer to
the original question is yes for every $n$ is to "complexify" $\mathbb{R}^n$ and the
linear transformations that act on it. There are sophisticated ways of doing
that for completely general real vector spaces, but in the case of $\mathbb{R}^n$ there
is hardly anything to do. Just recall that if $A$ is a linear transformation on
$\mathbb{R}^n$, then $A$ can be defined by a matrix (with real entries, of course), and such
a matrix defines at the same time a linear transformation (call it $A^+$) on $\mathbb{C}^n$.

The linear transformation $A^+$ on $\mathbb{C}^n$ has an eigenvalue and a corresponding
eigenvector; that is

$$A^+ z = \lambda z$$

for some complex number $\lambda$ and for some vector $z$ in $\mathbb{C}^n$. Consider the real
and imaginary parts of the complex number $\lambda$ and, similarly, separate out the
real and imaginary parts of the coordinates of the vector $z$. Some notation
would be helpful; write

$$\lambda = \alpha + i\beta,$$

with $\alpha$ and $\beta$ real, and

$$z = x + iy,$$

with $x$ and $y$ in $\mathbb{R}^n$. Since

$$A^+(x + iy) = (\alpha + i\beta)(x + iy),$$

it follows that

$$Ax = \alpha x - \beta y$$

and

$$Ay = \beta x + \alpha y.$$

There it is; that implies the desired conclusion: the subspace of $\mathbb{R}^n$
spanned by $x$ and $y$ is invariant under $A$.
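
Here is the recipe carried out on a concrete real matrix, as a sketch only: the matrix, its complex eigenpair, and the helper name `mat_vec` are illustrative assumptions. Python's built-in complex numbers play the role of $\mathbb{C}$, and the code checks the two real equations $Ax = \alpha x - \beta y$ and $Ay = \beta x + \alpha y$.

```python
def mat_vec(A, v):
    """Apply the matrix A (a list of rows) to the vector v."""
    return [sum(a * c for a, c in zip(row, v)) for row in A]

# A real 3 x 3 matrix (illustrative): a quarter turn in the first two
# coordinates, multiplication by 2 in the third.
A = [[0, -1, 0],
     [1, 0, 0],
     [0, 0, 2]]

# A complex eigenpair of the same matrix acting on C^3.
lam = 1j
z = [1, -1j, 0]
eigen_ok = mat_vec(A, z) == [lam * c for c in z]

# Separate real and imaginary parts: lam = alpha + i beta, z = x + i y.
alpha, beta = lam.real, lam.imag
x = [complex(c).real for c in z]
y = [complex(c).imag for c in z]

Ax = mat_vec(A, x)
Ay = mat_vec(A, y)
first_ok = Ax == [alpha * a - beta * b for a, b in zip(x, y)]
second_ok = Ay == [beta * a + alpha * b for a, b in zip(x, y)]
print(eigen_ok, first_ok, second_ok)
```

The real vectors $x = (1, 0, 0)$ and $y = (0, -1, 0)$ span the coordinate plane that the quarter turn maps onto itself.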


Solution 110. 


Yes, if a linear transformation $A$ on a finite-dimensional (complex) vector
space is such that $A^k = 1$ for some positive integer $k$, then $A$ is
diagonalizable. Here is the reasoning.

The assumption implies that every eigenvalue $\lambda$ of $A$ is a $k$th root of
unity. Consequence: each block in a triangularization of $A$ is of the form
$\lambda + T$, where $\lambda^k = 1$ and where $T$ is strictly upper triangular. By the
binomial theorem,

$$(\lambda + T)^k = 1 + k\lambda^{k-1}T + \cdots,$$

where the possible additional terms do not contribute to the lowest non-zero
diagonal of $T$. Conclusion: $(\lambda + T)^k$ can be 1 only when $T = 0$, that is, only
when each block in the triangularization is diagonal.


Solution 111. 


Since $M(x)$ is spanned by the $q$ vectors

$$x,\ Ax,\ A^2x,\ \ldots,\ A^{q-1}x,$$

its dimension cannot be more than $q$; the answer to the question is that for
an intelligently chosen $x$ that dimension can actually attain the value $q$. The
intelligent choice is not too difficult. Since the index of $A$ is $q$, there must
exist at least one vector $x_0$ such that

$$A^{q-1}x_0 \neq 0,$$

and each such vector constitutes an intelligent choice.
The assertion is that if

$$\alpha_0 x_0 + \alpha_1 Ax_0 + \alpha_2 A^2x_0 + \cdots + \alpha_{q-1} A^{q-1}x_0 = 0,$$

then each $\alpha_j$ must be 0. If that is not true, then choose the smallest index $j$
such that $\alpha_j \neq 0$. (If $\alpha_0 \neq 0$, then of course $j = 0$.) It makes life a little
simpler now to normalize the assumed linear dependence equation: divide
through by $\alpha_j$ and transpose all but $A^jx_0$ to the right side. The result is
an equation that expresses $A^jx_0$ as a linear combination of vectors obtained
from $x_0$ by applying the higher powers of $A$ (that is, the powers $A^k$ with
$k \geq j + 1$). Consequence:

$$A^jx_0 = A^{j+1}y$$

for some $y$. Since

$$A^{q-1}x_0 = A^{q-j-1}A^jx_0 = A^{q-j-1}A^{j+1}y = A^qy = 0$$

(the last equal sign is justified by the assumption that $A$ is nilpotent of index
$q$), a contradiction has arrived. (Remember the choice of $x_0$.) Since the only
possibly shaky step that led here was the choice of $j$, the forced conclusion
is that that choice is not possible. In other words, there is no smallest index
$j$ for which $\alpha_j \neq 0$, which says that $\alpha_j = 0$ for all $j$.

Corollary. The index of nilpotence of a transformation on a space of
dimension $n$ can never be greater than $n$.
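
The "intelligent choice" can be illustrated on a small nilpotent matrix. The matrix, the vector $x_0$, and the helper names below are assumptions chosen for the sketch; independence of the three vectors is confirmed by a non-zero determinant, which is a convenience of the check and not part of the argument above.

```python
def mat_vec(A, v):
    """Apply the matrix A (a list of rows) to the vector v."""
    return [sum(a * c for a, c in zip(row, v)) for row in A]

def det3(M):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# A nilpotent matrix of index q = 3 (illustrative choice).
A = [[0, 1, 1],
     [0, 0, 1],
     [0, 0, 0]]

x0 = [0, 0, 1]              # chosen so that A^2 x0 is not zero
Ax0 = mat_vec(A, x0)        # (1, 1, 0)
A2x0 = mat_vec(A, Ax0)      # (1, 0, 0)
A3x0 = mat_vec(A, A2x0)     # (0, 0, 0)

# x0, A x0, A^2 x0 are independent exactly when this is non-zero.
independence_det = det3([x0, Ax0, A2x0])
print(Ax0, A2x0, A3x0, independence_det)
```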


Solution 112. 


Perhaps somewhat surprisingly, the answer depends on size. If the dimension 
of the underlying space is 2, or, equivalently, if A and B denote 2 x 2 matrices, 
then AB and BA always have the same characteristic polynomial, and it 
follows that if AB is nilpotent, then so is BA. If a matrix of size 2 is 
nilpotent, then its index of nilpotence is less than or equal to 2. 

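For $3 \times 3$ matrices the conclusion is false. If, for instance,

$$A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix},$$

then

$$AB = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

is nilpotent of index 2, but $BA = B$ is not; it is nilpotent of index 3.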
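
The two indices are quick to confirm by direct multiplication. In this sketch the function names are invented for illustration; `nilpotency_index` relies on the fact that the index of a nilpotent $n \times n$ matrix is at most $n$, and the check also shows that for these particular matrices the product $BA$ coincides with $B$.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def nilpotency_index(M):
    """Smallest k with M^k = 0, or None if M is not nilpotent."""
    n = len(M)
    zero = [[0] * n for _ in range(n)]
    power = M
    for k in range(1, n + 1):
        if power == zero:
            return k
        power = mat_mul(power, M)
    return None

A = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
B = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]

AB = mat_mul(A, B)
BA = mat_mul(B, A)
print(nilpotency_index(AB), nilpotency_index(BA), BA == B)
```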


Solution 113. 


The result of applying $M$ to a vector $(\alpha, \beta, \gamma, \delta, \varepsilon)$ is $(\beta + \delta, \gamma - \varepsilon, 0, \varepsilon, 0)$.
When is that 0, or, in other words, which vectors are in the kernel of $M$?
Answer: $\varepsilon$ must be 0, hence $\gamma$ must be 0, and $\beta + \delta$ must be 0. So: the kernel
consists of all vectors of the form

$$(\alpha, \beta, 0, -\beta, 0),$$

a subspace of dimension 2. In view of this observation, and in view of the
given form of $M$, a reasonable hope is to begin the desired basis with

$$(1, 0, 0, 0, 0), \quad (0, 1, 0, 0, 0), \quad (0, 0, 1, 0, 0),$$

and

$$(0, -1, 0, 1, 0).$$

What is wanted for a fifth vector is one whose image under $M$ is $(0, -1, 0, 1, 0)$.
Since the image of $(\alpha, \beta, \gamma, \delta, \varepsilon)$ is $(\beta + \delta, \gamma - \varepsilon, 0, \varepsilon, 0)$, what is wanted is
to have $\beta + \delta = 0$, $\gamma - \varepsilon = -1$, and $\varepsilon = 1$. These equations have many
solutions; the simplest among them is

$$(0, 0, 0, 0, 1).$$

That's it: the last five displayed vectors do the job.


Solution 114. 


The answer is no but yes. No, not every matrix has a square root, but the 
reason is obvious (once you see it), and there is a natural way to get around 
the obstacle. 

An example of a matrix with no square root is

$$A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}.$$

(So is $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, but the larger example gives a little more of an idea of why
it works.) If, indeed, it were true that $A = B^2$, then (since $A^3 = 0$) it would
follow that $B^6 = 0$, and hence that $B$ is nilpotent. A nilpotent matrix of
size $3 \times 3$ must have index less than or equal to 3 (since the index is always
less than or equal to the dimension), and that implies $B^3 = 0$, and since
$B^4 = A^2 \neq 0$, that is a contradiction.

What's wrong? The answer is 0. People familiar with the theory of
multivalued analytic functions know that the point $z = 0$ is one at which the
function defined by $\sqrt{z}$ misbehaves; the better part of valor dictates that in
the study of square roots anything like 0 should be avoided. What in matrix
theory is "anything like 0"? Reasonable looking answer: matrices that have
0 in their spectrum. How are they to be avoided? Answer: by sticking to
invertible matrices. Very well then: does every invertible matrix have a square
root?

Here is where the Jordan form can be used to good advantage. Every
invertible matrix is similar to a direct sum of matrices such as

$$\begin{pmatrix} \lambda & 1 & 0 & 0 \\ 0 & \lambda & 1 & 0 \\ 0 & 0 & \lambda & 1 \\ 0 & 0 & 0 & \lambda \end{pmatrix},$$

with $\lambda \neq 0$, and, consequently, it is sufficient to decide whether or not every
matrix of that form has a square root.

The computations are somewhat easier in case $\lambda = 1$, and it is possible
to reduce to that case simply by dividing by $\lambda$. When that is done, the 1's
above the diagonal turn into $\frac{1}{\lambda}$'s, to be sure, but in that position they cause
no trouble. So the problem is to find a square root for something like

$$\begin{pmatrix} 1 & \alpha & 0 & 0 \\ 0 & 1 & \alpha & 0 \\ 0 & 0 & 1 & \alpha \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

One way to do that is to look for a square root of the form

$$\begin{pmatrix} 1 & x & y & z \\ 0 & 1 & x & y \\ 0 & 0 & 1 & x \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

Set the square of that matrix equal to the given one and look for solutions $x$,
$y$, $z$ of the resulting equations. That works!

There is a more sophisticated approach. Think of the given matrix as
$1 + M$, where

$$M = \begin{pmatrix} 0 & \alpha & 0 & 0 \\ 0 & 0 & \alpha & 0 \\ 0 & 0 & 0 & \alpha \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$

The reason that's convenient is that it makes possible the application of facts
about the function $\sqrt{1 + \zeta}$.

As is well known, Professor Moriarty “wrote a treatise upon the binomial
theorem, which has had a European vogue”; the theorem asserts that the power
series expansion of $(1 + \zeta)^\xi$ is

$$(1 + \zeta)^\xi = 1 + \binom{\xi}{1}\zeta + \binom{\xi}{2}\zeta^2 + \binom{\xi}{3}\zeta^3 + \cdots.$$

(Here a binomial coefficient such as, for instance, $\binom{\xi}{3}$ denotes

$$\frac{\xi(\xi - 1)(\xi - 2)}{3!},$$

and the parameter $\xi$ can be any real number.) The series converges for some
values of $\zeta$ and does not converge for others, but, for the moment, none of
that matters. What does matter is that the equation is "formally" right. That
means, for instance, that if the series for $\xi = \frac{1}{2}$ is multiplied by itself, then the
constant term and the coefficient of $\zeta$ turn out to be 1 and all other coefficients
turn out to be 0: the product is exactly $1 + \zeta$. In the application that is about
to be made the variable $\zeta$ will be replaced by a nilpotent matrix, so that only
a finite number of non-zero terms will appear, and in that case convergence
is not a worry.

All right: consider the series with $\xi = \frac{1}{2}$, and replace the variable $\zeta$ by
the matrix $M$. The result is

$$\begin{pmatrix} 1 & \frac{\alpha}{2} & -\frac{\alpha^2}{8} & \frac{\alpha^3}{16} \\ 0 & 1 & \frac{\alpha}{2} & -\frac{\alpha^2}{8} \\ 0 & 0 & 1 & \frac{\alpha}{2} \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

(check?), and that works, meaning that its square is $1 + M$ (check?). So,
one way or another, it is indeed true that every invertible matrix has a square
root.
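
Both of the "check?" invitations can be accepted with exact rational arithmetic. The following is a sketch under stated assumptions: the value $\alpha = 3$ is an arbitrary illustrative choice, the helper names are invented here, and the binomial coefficients for $\xi = \frac{1}{2}$ are generated by the recurrence $c_{k+1} = c_k(\xi - k)/(k + 1)$. Since $M^4 = 0$, four terms of the series are all there is.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    return [[c * a for a in row] for row in X]

n = 4
alpha = Fraction(3)     # illustrative value; any non-zero number will do
I = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
M = [[alpha if j == i + 1 else Fraction(0) for j in range(n)]
     for i in range(n)]

# Sum the binomial series for exponent 1/2 with zeta replaced by M.
xi = Fraction(1, 2)
root = [[Fraction(0)] * n for _ in range(n)]
power = I               # current power of M
coeff = Fraction(1)     # current binomial coefficient
for k in range(n):      # M**n is zero, so the series stops here
    root = mat_add(root, scale(coeff, power))
    power = mat_mul(power, M)
    coeff = coeff * (xi - k) / (k + 1)

square_ok = mat_mul(root, root) == mat_add(I, M)
print(root[0], square_ok)
```

The first row comes out as $1, \frac{\alpha}{2}, -\frac{\alpha^2}{8}, \frac{\alpha^3}{16}$ with $\alpha = 3$, in agreement with the displayed matrix.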


Solution 115.


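The differentiation transformation $D$ is nilpotent of index 4 (the dimension of
the space). Consequence: both the minimal polynomial and the characteristic
polynomial are equal to $\lambda^4$.

As for $T$, its only eigenvalue is 1. Indeed: if

$$\alpha + \beta(t + 1) + \gamma(t + 1)^2 + \delta(t + 1)^3 = \lambda(\alpha + \beta t + \gamma t^2 + \delta t^3),$$

then

$$\begin{aligned} \alpha + \beta + \gamma + \delta &= \lambda\alpha, \\ \beta + 2\gamma + 3\delta &= \lambda\beta, \\ \gamma + 3\delta &= \lambda\gamma, \\ \delta &= \lambda\delta. \end{aligned}$$

It follows that if $\lambda \neq 1$, then the last equation gives $\delta = 0$, and therefore,
working upward through the other three equations,

$$\gamma = \beta = \alpha = 0,$$

so that no such $\lambda$ is an eigenvalue. On the other hand if $\lambda = 1$, then

$$\begin{aligned} \beta + \gamma + \delta &= 0, \\ 2\gamma + 3\delta &= 0, \\ 3\delta &= 0, \end{aligned}$$

and therefore

$$\beta = \gamma = \delta = 0.$$

(Another way to get here is to look at the matrix in Solution 108.) Conclusion:
both the minimal polynomial and the characteristic polynomial are $(\lambda - 1)^4$.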
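
The claim about the minimal polynomial of $T$ amounts to saying that $(T - 1)^3 \neq 0$ while $(T - 1)^4 = 0$. Here is a sketch of that check. It assumes, as the displayed equations indicate, that $T$ sends $p(t)$ to $p(t + 1)$ on polynomials of degree at most 3, so that its matrix with respect to $1, t, t^2, t^3$ has binomial coefficients as entries; the helper names are illustrative.

```python
import math

def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

n = 4
# Column j holds the coefficients of (t + 1)^j, so entry (i, j) is C(j, i).
T = [[math.comb(j, i) for j in range(n)] for i in range(n)]
N = [[T[i][j] - int(i == j) for j in range(n)] for i in range(n)]  # T - 1
zero = [[0] * n for _ in range(n)]

N2 = mat_mul(N, N)
N3 = mat_mul(N2, N)
N4 = mat_mul(N3, N)
print(T)
print("(T-1)^3 == 0:", N3 == zero, " (T-1)^4 == 0:", N4 == zero)
```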


Solution 116. 


Yes, it’s always true that one polynomial can do on each of n prescribed 
transformations what n prescribed polynomials do. The case n = 2 is typical 
and notationally much less cumbersome; here is how it goes. 

Given: two linear transformations $A$ and $B$ with disjoint spectra, and two
polynomials $p$ and $q$. Wanted: a polynomial $r$ such that

$$r(A) = p(A)$$

and

$$r(B) = q(B).$$

If there is such a polynomial $r$, then the difference $r - p$ annihilates
$A$. The full annihilator of $A$, that is the set of all polynomials $f$ such that
$f(A) = 0$, is an ideal in the ring of all complex polynomials; every such
polynomial is a multiple of the minimal polynomial $p_0$ of $A$. Consequence:
if there is an $r$ of the kind sought, then

$$r = sp_0 + p$$

for some polynomial $s$, and, similarly,

$$r = tq_0 + q,$$

where $q_0$ is the minimal polynomial of $B$. Conversely, clearly, any $sp_0 + p$
maps $A$ onto $p(A)$, and any $tq_0 + q$ maps $B$ onto $q(B)$; the problem is to
find an $r$ that is simultaneously an $sp_0 + p$ and a $tq_0 + q$. In other words, the
problem is to find polynomials $s$ and $t$ such that

$$sp_0 - tq_0 = q - p.$$

Since $p_0$ and $q_0$ are relatively prime (this is the step that uses the assumed
disjointness of the spectra of $A$ and $B$), it is a standard consequence of the
Euclidean algorithm that such polynomials $s$ and $t$ do exist.
The general case ($n > 2$) can be treated either by imitating the special
case or else by induction. Here is how the induction argument goes.
Assume the conclusion for $n$, and pass to $n + 1$ as follows. By the
induction hypothesis, there is a polynomial $p$ such that

$$p(A_j) = p_j(A_j)$$

for $j = 1, \ldots, n$. Write

$$A = A_1 \oplus \cdots \oplus A_n$$

(direct sum),

$$B = A_{n+1},$$

and

$$q = p_{n+1}.$$

Note that the spectra of $A$ and $B$ are disjoint (because the spectrum of $A$ is
the union of the spectra of the $A_j$'s, $j = 1, \ldots, n$), and therefore, by the case
$n = 2$ of the theorem, there exists a polynomial $r$ such that

$$r(A) = p(A)$$

and

$$r(B) = q(B).$$

Once the notation is unwound, these equations become

$$r(A_1) \oplus \cdots \oplus r(A_n) = p_1(A_1) \oplus \cdots \oplus p_n(A_n)$$

and

$$r(A_{n+1}) = p_{n+1}(A_{n+1}).$$

The first of these equations implies that

$$r(A_j) = p_j(A_j)$$

for $j = 1, \ldots, n$, and that concludes the proof.

The result holds for all fields, not only C, provided that the hypothesis 
of the disjointness of spectra is replaced by its algebraically more usable 
version, namely the pairwise relative primeness of the minimal polynomials 
of the given transformations.


Chapter 8. Inner Product Spaces


Solution 117. 


An orthogonal set of non-zero vectors is always linearly independent. (The
case in which one of them is zero is degenerate: then, of course, they are
dependent.) Indeed, if

$$\alpha_1 x_1 + \cdots + \alpha_n x_n = 0,$$

form the inner product of both sides of the equation with any $x_j$ and get

$$\alpha_j(x_j, x_j) = 0.$$

The reason is that if $i \neq j$, then the inner product $(x_i, x_j)$ is 0; that's what
the assumed orthogonality says. Since $(x_j, x_j) \neq 0$ (by the assumed
non-zeroness), it follows that $\alpha_j = 0$: every linear dependence relation must be
trivial.


Solution 118. 


The answer to the question as posed is no: different inner products must yield 
different norms. The proof is a hard one to discover but a boring one to 
verify—the answer is implied by the equation 

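$$(x, y) = \left\|\tfrac{1}{2}(x + y)\right\|^2 - \left\|\tfrac{1}{2}(x - y)\right\|^2 + i\left\|\tfrac{1}{2}(x + iy)\right\|^2 - i\left\|\tfrac{1}{2}(x - iy)\right\|^2,$$
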


which is called the polarization formula. It might be somewhat frighten- 
ing when first encountered, but it doesn’t take long to understand, and once 
it’s absorbed it is useful—it is worth remembering, or, at the very least, its 
existence is worth remembering. 
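
The polarization formula is boring to verify by hand, so here is a numerical spot check, offered as a sketch. It assumes the standard inner product on $\mathbb{C}^2$, linear in the first variable and conjugate linear in the second; the sample vectors and the helper names are arbitrary illustrative choices.

```python
def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    return inner(x, x).real

def half_combo(x, y, c):
    """The vector (x + c*y) / 2."""
    return [(a + c * b) / 2 for a, b in zip(x, y)]

def polarization(x, y):
    """Right-hand side of the polarization formula, built from norms only."""
    return (norm_sq(half_combo(x, y, 1))
            - norm_sq(half_combo(x, y, -1))
            + 1j * norm_sq(half_combo(x, y, 1j))
            - 1j * norm_sq(half_combo(x, y, -1j)))

x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]
difference = abs(polarization(x, y) - inner(x, y))
print(inner(x, y), polarization(x, y))
```

Since the right-hand side uses nothing but norms, two inner products with the same norm must agree, which is the point of the solution.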


Solution 119. 
What is always true is that

$$\|x + y\|^2 = \|x\|^2 + (x, y) + (y, x) + \|y\|^2.$$

For real vector spaces the two cross product terms are equal; the equation

$$\|x + y\|^2 = \|x\|^2 + \|y\|^2$$

is equivalent to $(x, y) = 0$, and all is well.
In complex vector spaces, however, $(y, x)$ is the complex conjugate of
$(x, y)$; the sum of the two cross product terms is $2\,\mathrm{Re}(x, y)$. The equation
between norms is equivalent to $\mathrm{Re}(x, y) = 0$, and that is not the same as
orthogonality. An obvious way to try to construct a concrete counterexample
is to start with an arbitrary vector $x$ and set $y = ix$. In that case

$$\|x + y\|^2 = \|(1 + i)x\|^2 = 2\|x\|^2 = \|x\|^2 + \|y\|^2,$$

but (except in the degenerate case $x = 0$) the vectors $x$ and $y$ are not
orthogonal.


Solution 120. 
Multiply out $\|x + y\|^2 + \|x - y\|^2$, get

$$\|x\|^2 + (x, y) + (y, x) + \|y\|^2 + \|x\|^2 - (x, y) - (y, x) + \|y\|^2,$$

and conclude that the equation in the statement of the problem is in fact an
identity, true for all vectors $x$ and $y$ in all inner product spaces.


Solution 121. 


Yes, every inner product space of dimension $n$ has an orthonormal set of
$n$ elements. Indeed consider, to begin with, an arbitrary orthonormal set. If
no larger one jumps to the eye, a set with one element will do: take an
arbitrary non-zero vector $x$, and normalize it (that is, replace it by $\frac{x}{\|x\|}$). If
the orthonormal set on hand is not maximal, enlarge it, and if the resulting
orthonormal set is still not maximal, enlarge it again, and proceed in this
way by induction. Since an orthonormal set can contain at most $n$ elements
(Problem 117), this process leads to a complete orthonormal set in at most $n$
steps.

Assertion: such a set spans the whole space. Reason: if the set is
$\{x_1, x_2, \ldots\}$, and if some vector $x$ is not a linear combination of the $x_j$'s,
then form the vector

$$y = x - \sum_j (x, x_j)x_j.$$

The assumption about $x$ implies that $y \neq 0$. Since, moreover,

$$(y, x_i) = (x, x_i) - \sum_j (x, x_j)\delta_{ij} = (x, x_i) - (x, x_i) = 0,$$

so that $y$ is orthogonal to each of the $x_i$'s, the normalized vector $\frac{y}{\|y\|}$, when
adjoined to the $x_i$'s, leads to a larger orthonormal set. That's a contradiction,
and, therefore, the $x_i$'s do indeed span the space. Since they are also linearly
independent (Problem 117 keeps coming up), it follows that they constitute a
basis, and hence that there must be $n$ of them.


Comment. There is a different way to express the proof, a more constructive
way. The idea is to start with a basis $\{x_1, \ldots, x_n\}$ and by continued
modifications convert it to an orthonormal set. Here is an outline of how that goes.
Since $x_1 \neq 0$, it is possible to form $y_1 = \frac{x_1}{\|x_1\|}$. Once $y_1, \ldots, y_r$ have been
found so that each $y_j$ is a linear combination of $x_1, \ldots, x_j$, form

$$x_{r+1} - \sum_{j=1}^{r} (x_{r+1}, y_j)y_j,$$

verify that it is linearly independent of $y_1, \ldots, y_r$, and normalize it. These
steps are known as the Gram-Schmidt orthogonalization process.
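
The outline translates almost word for word into code. The following is a minimal sketch for real vectors with the usual dot product; the function names and the sample basis are assumptions made for illustration, and floating-point arithmetic means that orthonormality holds only up to rounding.

```python
import math

def inner(x, y):
    """Dot product of two real vectors."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(basis):
    """Convert a list of independent real vectors to an orthonormal list."""
    ortho = []
    for x in basis:
        # Form x - sum_j (x, y_j) y_j over the y_j found so far.
        r = list(x)
        for y in ortho:
            c = inner(x, y)
            r = [ri - c * yi for ri, yi in zip(r, y)]
        norm = math.sqrt(inner(r, r))
        if norm < 1e-12:
            raise ValueError("input vectors are linearly dependent")
        ortho.append([ri / norm for ri in r])   # normalize
    return ortho

ys = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
for y in ys:
    print([round(c, 4) for c in y])
```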


Solution 122. 


If $x$ and $y$ are the same vector, then both sides of the Schwarz inequality are
equal to $\|x\|^2$. More generally if one of $x$ and $y$ is a scalar multiple of the
other (in that case there is no loss of generality in assuming that $y = \alpha x$),
then both sides of the inequality are equal to $|\alpha| \cdot \|x\|^2$. If $x$ and $y$ are linearly
dependent, then one of them is a scalar multiple of the other. In all these cases
the Schwarz inequality becomes an equation; can the increasing generality
of this sequence of statements be increased still further? The answer is no:
the Schwarz inequality can become an equation for linearly dependent pairs
of vectors only.
One proof of that assertion is by black magic, as follows. If

$$|(x, y)| = \|x\| \cdot \|y\|,$$

replace $x$ by $\gamma x$, where $\gamma$ is a complex number of absolute value 1 chosen so
that $\gamma(x, y)$ is real. The assumed "Schwarz equation" is still true, but with
the new $x$ (and the same old $y$) it takes the form

$$(x, y) = \|x\| \cdot \|y\|.$$

This is not an important step; it just makes the black magic that follows a
tiny bit more mysterious still. Once that step has been taken, evaluate the
expression

$$\bigl\| \|y\|x - \|x\|y \bigr\|^2.$$

Since it is equal to

$$(\|y\|x - \|x\|y,\ \|y\|x - \|x\|y) = \|x\|^2\|y\|^2 - 2\|x\|^2\|y\|^2 + \|x\|^2\|y\|^2 = 0,$$

it follows that $\|y\|x - \|x\|y = 0$, which is indeed a linear dependence between
$x$ and $y$.

One reason why the Schwarz inequality is true, and why equality happens
only in the presence of linear dependence, can be seen by looking at simple
special cases. Look, for instance, at two vectors in $\mathbb{R}^2$, say $x = (\alpha, \beta)$ and
$y = (1, 0)$. Then

$$\|x\| = \sqrt{|\alpha|^2 + |\beta|^2}, \quad \|y\| = 1, \quad \text{and} \quad (x, y) = \alpha;$$

the Schwarz inequality reduces to the statement

$$|\alpha| \leq \sqrt{|\alpha|^2 + |\beta|^2},$$

which becomes an equation just when $\beta = 0$.

An approach to the theorem that is neither black magic nor overly
simplistic could go like this. Assume that $|(x, y)| = \|x\| \cdot \|y\|$ and, temporarily
fixing a real parameter $\alpha$, consider

$$\|x - \alpha y\|^2 = (x - \alpha y, x - \alpha y) = \|x\|^2 - 2\,\mathrm{Re}(x, \alpha y) + |\alpha|^2\|y\|^2.$$

This indicates why changing $x$ so as to make $(x, y)$ real is a helpful thing to
do; if that's done, then the right term becomes

$$(\|x\| - |\alpha| \cdot \|y\|)^2.$$

Inspiration: choose the parameter $\alpha$ so as to make that term equal to 0 (which
explains the reason for writing down the black magic expression); the
possibility of such a choice proves that $x - \alpha y$ can be made equal to 0, which
is a statement of linear dependence.


Solution 123. 


If $M$ is a subspace of a finite-dimensional inner product space $V$, then $M$ and
$M^\perp$ are complements (Problem 28), and $M^{\perp\perp} = M$. For the proof, consider
an orthonormal basis $\{x_1, \ldots, x_m\}$ for the subspace $M$. If $z$ is an arbitrary
vector in $V$, form the vectors

$$x = \sum_{i=1}^{m} (z, x_i)x_i \quad \text{and} \quad y = z - x.$$

Since $x$ is a linear combination of the $x_i$'s, it belongs to $M$, and since $y$ is
orthogonal to each $x_i$ it belongs to $M^\perp$. Consequence:

$$V = M + M^\perp.$$

If a vector $u$ belongs to both $M$ and $M^\perp$, then $(u, u) = 0$ (by the definition
of $M^\perp$): that implies, of course, that $u = 0$, that is that the subspaces $M$ and
$M^\perp$ are disjoint. Conclusion (in the language of Problem 50): $V$ is the direct
sum of $M$ and $M^\perp$, and that's as good a relation between $M$ and $M^\perp$ as can
be hoped for.

The definitions of $x$ and $y$ imply that

$$(z, x) = (x + y, x) = \|x\|^2 + (y, x) = \|x\|^2,$$

and, similarly,

$$(z, y) = (x + y, y) = (x, y) + \|y\|^2 = \|y\|^2.$$

It follows that if $z$ is in $M^{\perp\perp}$, so that $(z, y) = 0$, then $\|y\|^2 = 0$, so that
$z = x$ and therefore $z$ is in $M$; in other words $M^{\perp\perp} \subset M$. Since the reverse
inclusion $M \subset M^{\perp\perp}$ is already known, it now follows that $M = M^{\perp\perp}$, and
that's as good a relation between $M$ and $M^{\perp\perp}$ as can be hoped for.


Solution 124. 


The answer is yes: every linear functional $\xi$ on an inner product space $V$ is
induced as an inner product. For the proof it is good to look at the vectors $x$
for which $\xi(x) = 0$. If every $x$ is like that, then $\xi = 0$, and there is nothing
more to say. In any case, the kernel of $\xi$ is a subspace of $V$, and it is pertinent
to consider its orthogonal complement, which it is convenient to denote by
$\ker^\perp \xi$. If $\ker \xi \neq V$ (and that may now be assumed), then $\ker^\perp \xi$ contains
at least one non-zero vector $y_0$. It is true in fact (even though for present
purposes it is not strictly needed) that $\ker^\perp \xi$ consists of all scalar multiples
of any such vector $y_0$; in other words, the subspace $\ker^\perp \xi$ has dimension 1.
Indeed: if $y$ is in $\ker^\perp \xi$, then so is every vector of the form $y - \alpha y_0$. The
value of $\xi$ at such a vector, that is

$$\xi(y - \alpha y_0) = \xi(y) - \alpha\xi(y_0),$$

can be made equal to 0 by a suitable choice of the scalar $\alpha$ (namely,

$$\alpha = \frac{\xi(y)}{\xi(y_0)}),$$

which means, for that value of $\alpha$, that $y - \alpha y_0$ belongs to both $\ker \xi$ and
$\ker^\perp \xi$. Conclusion: $y - \alpha y_0 = 0$, that is $y = \alpha y_0$.
The vector $y_0$ "works" just fine for the vectors in $\ker \xi$, meaning that if
$x$ is in $\ker \xi$, then

$$\xi(x) = (x, y_0)$$

(because both sides are equal to 0), and the same thing is true for every scalar
multiple of $y_0$. Does the vector $y_0$ work for the vectors in $\ker^\perp \xi$ also? That
is: is it true for an arbitrary element $\alpha y_0$ in $\ker^\perp \xi$ that

$$\xi(\alpha y_0) = (\alpha y_0, y_0)?$$

The equation is equivalent to

$$\xi(y_0) = \|y_0\|^2,$$

and there is no reason why that must be true, but, obviously, it can be true if
$y_0$ is replaced by a suitable scalar multiple of itself. Indeed: if $y_0$ is replaced
by $\gamma y_0$, the desired equation reduces to

$$\gamma\xi(y_0) = |\gamma|^2 \cdot \|y_0\|^2,$$

which can be satisfied by choosing $\gamma$ so that

$$\xi(y_0) = \bar{\gamma}\|y_0\|^2.$$


Solution 125. 


(a) Since by the very definition of adjoints $(U^*\eta, \zeta)$ is always equal to $(\eta, U\zeta)$,
the way to determine $U^*$ is to calculate with $(U^*\eta, \zeta)$. That's not inspiring,
but it is doable. The way to do it is to begin with

$$(U^*(x_1, y_1), (x_2, y_2))$$

and juggle till it becomes an inner product with the same second term $(x_2, y_2)$
and a pleasant, simple first term that does not explicitly involve $U$. The
beginning is natural enough:

$$(U^*(x_1, y_1), (x_2, y_2)) = ((x_1, y_1), U(x_2, y_2)) = ((x_1, y_1), (y_2, -x_2)) = (x_1, y_2) - (y_1, x_2).$$

That's an inner product all right, but it is one whose second term is $(y_2, x_2)$
instead of $(x_2, y_2)$. Easy to fix:

$$(U^*(x_1, y_1), (x_2, y_2)) = (-y_1, x_2) + (x_1, y_2) = ((-y_1, x_1), (x_2, y_2)).$$

That does it: the identity so derived implies that

$$U^*(x, y) = (-y, x),$$

and that's the sort of thing that is wanted. What it shows is a surprise: it
shows that

$$U^* = -U.$$

The calculation of $U^*U$ is now trivial:

$$U^*U(x, y) = U^*(y, -x) = (x, y),$$

so that $U^*U$ is equal to the identity transformation. The verification that $UU^*$
is the same thing is equally trivial.


(b) Yes, a graph is always a subspace. The verification is direct: if
$(x_1, y_1)$ and $(x_2, y_2)$ are in the graph of $A$, so that

$$y_1 = Ax_1 \quad \text{and} \quad y_2 = Ax_2,$$

then $\alpha_1(x_1, y_1) + \alpha_2(x_2, y_2)$ is in the graph of $A$, because

$$\alpha_1 y_1 + \alpha_2 y_2 = A(\alpha_1 x_1 + \alpha_2 x_2).$$


(c) The graph of $A^*$ is the orthogonal complement of the image under
$U$ of the graph of $A$. To prove that, note that the graph of $A$ is the set of all
pairs of the form $(x, Ax)$, and hence the $U$ image of that graph is the set of
all pairs of the form $(-Ax, x)$. The orthogonal complement (in $V \oplus V$) of
that image is the set of all those pairs $(u, v)$ for which

$$(-Ax, u) + (x, v) = 0$$

identically in $x$. That means that

$$(x, -A^*u + v) = 0$$

for all $x$, and hence that $A^*u = v$. The set of all pairs $(u, v)$ for which
$A^*u = v$ is the set of all pairs of the form $(u, A^*u)$, and that's just the graph
of $A^*$.

That wasn’t bad to verify—was it?—but how could it have been discov- 
ered? That sort of question is always worth thinking about. 


Solution 126. 
(a) Yes, congruence is an equivalence relation. Indeed, clearly,

$$A = P^*AP \quad (\text{with } P = 1);$$

$$\text{if } B = P^*AP, \text{ then } A = Q^*BQ \quad (\text{with } Q = P^{-1});$$

and

$$\text{if } B = P^*AP \text{ and } C = Q^*BQ, \text{ then } C = R^*AR \quad (\text{with } R = PQ).$$

(b) Yes: if $B = P^*AP$, then $B^* = P^*A^*P$.

(c) No: a transformation congruent to a scalar doesn't have to be a scalar.
Indeed, if $P$ is an arbitrary invertible transformation such that $P^*P$ is not
a scalar (such things abound), then $P^*P$ ($= P^* \cdot 1 \cdot P$) is congruent to the
scalar 1 without being equal to it.

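(d) The answer to this one is not obvious; some head scratching is
needed. The correct answer is yes: it is possible for $A$ and $B$ to be congruent
without $A^2$ and $B^2$ being congruent. Here is one example:

$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}.$$

The computation is easy. If

$$P = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},$$

then

$$P^*AP = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix} = B,$$

so that, indeed, $A$ is congruent to $B$. Since, however,

$$A^2 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad B^2 = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix},$$

it follows that $A^2$ cannot be congruent to $B^2$. (Is a microsecond's thought
necessary? Can the transformation 0 be congruent to a transformation that is
not 0? No: since $P^* \cdot 0 \cdot P = 0$, it follows that being congruent to 0 implies
being equal to 0.)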


Solution 127. 


The desired statement is the converse of a trivial one: if A = 0, then (Ax, x) =
0 for all x. In the non-trivial direction the corresponding statement about
sesquilinear forms (in place of quadratic ones) is accessible: if (Ax, y) = 0
for all x and y, then A = 0. Proof: set y = Ax. A possible approach to the
quadratic result, therefore, is to reduce it to the sesquilinear one: try to prove
that if (Ax, x) = 0 for all x, then (Ax, y) = 0 for all x and y.

What is wanted is (or should be?) reminiscent of polarization (Solution
118). What that formula does is express the natural sesquilinear form (x, y) in
terms of the natural quadratic form ||x||². Can that expression be generalized?
Yes, it can, and the generalization is no more troublesome than the original
version. It looks like this:

\[ (Ax, y) = \tfrac14 \Big[ (A(x+y), x+y) - (A(x-y), x-y) + i\,(A(x+iy), x+iy) - i\,(A(x-iy), x-iy) \Big]. \]

Once that’s done, everything is done: if (Ax, x) is identically 0, then so is
(Ax, y).
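The generalized polarization identity can be spot-checked numerically. The following is a minimal sketch, assuming the inner product is linear in its first argument and conjugate linear in its second (the convention the solutions use); the sample matrix and vectors are arbitrary choices made for illustration only.

```python
# Numerical spot check of the generalized polarization identity on C^2.
# Convention assumed: (x, y) = sum of x_k * conj(y_k).

def inner(x, y):
    """Inner product, linear in x and conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(A, x):
    """Matrix-vector product Ax for a matrix given as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def add(x, y, c=1):
    """Return the vector x + c*y."""
    return [a + c * b for a, b in zip(x, y)]

def quad(A, x):
    """Quadratic form (Ax, x)."""
    return inner(apply(A, x), x)

def polarized(A, x, y):
    """Rebuild (Ax, y) from four values of the quadratic form."""
    return (quad(A, add(x, y)) - quad(A, add(x, y, -1))
            + 1j * quad(A, add(x, y, 1j))
            - 1j * quad(A, add(x, y, -1j))) / 4

# Hypothetical sample data: a non-Hermitian matrix and two vectors.
A = [[1 + 2j, 3], [-1j, 4 - 1j]]
x = [2, 1 - 1j]
y = [1j, 3 + 2j]

lhs = inner(apply(A, x), y)
rhs = polarized(A, x, y)
print(abs(lhs - rhs) < 1e-12)
```

A single agreement proves nothing, of course; the point of the sketch is only to make the bookkeeping of the four terms concrete.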


Solution 128. 


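The product of two Hermitian transformations is not always Hermitian, or,
equivalently, the product of two conjugate symmetric matrices is not always
conjugate symmetric. It is hard not to write down an example. Here is one:

\[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} = \begin{pmatrix} 0 & 2 \\ 1 & 0 \end{pmatrix}. \]

Does the order matter? Yes, it matters in the sense that if the same two
matrices are multiplied in the other order, then they give a different answer,

\[ \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 2 & 0 \end{pmatrix}, \]

but the answer “no” does not change to the answer “yes”.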
How likely is the product of two Hermitian transformations to be Her- 
mitian? If A and B are Hermitian, and if AB also is Hermitian, then 


(AB)* = AB, 


which implies (since (AB)* = B*A* = BA) that BA = AB. What this proves is that for the product of two
Hermitian transformations to be Hermitian it is necessary that they commute. 
Is the condition sufficient also? Sure—just read the argument backward. 
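Both halves of the statement are easy to check by machine on small examples. The sketch below uses the two matrices of the displayed example and, as an assumption made only for illustration, a third matrix C that happens to commute with the first one.

```python
# Product of Hermitian matrices: Hermitian exactly when the factors commute.

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def adjoint(X):
    """Conjugate transpose."""
    return [[X[j][i].conjugate() for j in range(len(X))]
            for i in range(len(X[0]))]

def is_hermitian(X):
    return X == adjoint(X)

A = [[0, 1], [1, 0]]
B = [[1, 0], [0, 2]]
AB, BA = matmul(A, B), matmul(B, A)

# Hypothetical extra matrix, chosen here only because it commutes with A.
C = [[2, 1], [1, 2]]
AC, CA = matmul(A, C), matmul(C, A)

print(AB, BA, is_hermitian(AB))    # the non-commuting pair
print(AC == CA, is_hermitian(AC))  # the commuting pair
```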


Solution 129. 
(a) If B = P*AP and A* = −A, then

\[ B^* = P^*A^*P = P^*(-A)P = -P^*AP = -B. \]


Conclusion: a transformation congruent to a skew one is skew itself. 
(b) If A* = −A, then

\[ (A^2)^* = (A^*)^2 = (-A)^2 = A^2, \]

which is not necessarily the same as −A². Conclusion: the square of a skew
transformation doesn’t have to be skew. Sermon: this is an incomplete proof.
For perfect honesty it should be accompanied by a concrete example of a
skew transformation A such that A² ≠ −A². One of the simplest such
transformations is given by the matrix

\[ \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \]

As for A³, since (−1)³ = −1, it follows that A* = −A implies (A³)* =
−A³, so that A³ is skew along with A.
(c) Write 
S (for sum) = AB + BA 
and 


D (for difference) = AB — BA. 
The question is: what happens to 
S* = B* A* + A* B* 
and 
D* = B* A* — A* B* 


when A* and B* are replaced by A and B, possibly with changes of sign? 
The answer is that if the number of sign changes is even (0 or 2), then S 
remains Hermitian and D remains skew, but if the number of sign changes is 
odd (which has to mean 1), then S becomes skew and D becomes Hermitian. 


Solution 130. 
If A = A*, then

\[ (Ax, x) = (x, A^*x) = (x, Ax) = \overline{(Ax, x)}, \]

so that (Ax, x) is equal to its own conjugate and is therefore real. If, conversely,
(Ax, x) is always real, then

\[ (Ax, x) = \overline{(Ax, x)} = \overline{(x, A^*x)} = (A^*x, x), \]

so that ((A − A*)x, x) = 0 for all x, and, by Problem 127, A = A*.


Solution 131. 


(a) The entries not on the main diagonal influence positiveness less than the
ones on it. So, for example, from the known positiveness of

\[ \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \]

it is easy to infer the positiveness of

\[ \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix}. \]

(b) Yes, and an example has already been seen, namely

\[ \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix}. \]
(c) A careful look at

\[ \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix} \]

shows that the quadratic form associated with it is

\[ |\xi_1 + \xi_2 + \xi_3|^2, \]

and that answers the question: yes, the matrix is positive.

(d) The quadratic form associated with

\[ \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix} \]

is

\[ |\xi_1 + \xi_3|^2 + |\xi_2|^2, \]

and that settles the matter; yes, the matrix is positive.

(e) The quadratic form associated with

\[ \begin{pmatrix} \alpha & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix} \]

is

\[ \alpha|\xi_1|^2 + 2\,\mathrm{Re}\,\xi_1\bar\xi_2 + 2\,\mathrm{Re}\,\xi_1\bar\xi_3, \]

and the more one looks at that, the less positive it looks. It doesn’t really
matter what ξ_3 is; it will do no harm to set it equal to 0. The enemy is the
coefficient α, and it can be conquered. No matter what α is, choose ξ_1 to be
1, and then choose ξ_2 to be a gigantic negative number; the resulting value
of the quadratic form will be negative. The answer to the question as posed
is: none.
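The recipe in (e) can be tried out directly. The sketch below evaluates the quadratic form of the matrix with rows (α, 1, 1), (1, 0, 0), (1, 0, 0) at the vector (1, −(|α| + 1), 0); the particular size of the “gigantic negative number” is an assumption made here for concreteness, and any sufficiently large one would do.

```python
# Quadratic form of the matrix in (e), evaluated at a vector built by the
# recipe in the text: xi_1 = 1, xi_2 very negative, xi_3 = 0.

def quadratic_form(alpha, xi):
    """Return (Ax, x) for A = [[alpha, 1, 1], [1, 0, 0], [1, 0, 0]]."""
    A = [[alpha, 1, 1], [1, 0, 0], [1, 0, 0]]
    Ax = [sum(a * b for a, b in zip(row, xi)) for row in A]
    return sum(a * complex(b).conjugate() for a, b in zip(Ax, xi))

def witness(alpha):
    """A vector at which the form is negative (hypothetical choice of size)."""
    return [1, -(abs(alpha) + 1), 0]

for alpha in (-5, 0, 1, 100, 1e6):
    value = quadratic_form(alpha, witness(alpha))
    print(alpha, value.real)
```

For real α the value works out to α − 2(|α| + 1), which is at most −2, in agreement with the answer “none”.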


Solution 132. 


Yes, if a positive transformation is invertible, then its inverse also is positive.
The proof takes one line, but a trick has to be thought of.
How does it follow from (Ax, x) ≥ 0 for all x that (A⁻¹y, y) ≥ 0 for
all y? Answer: put y = Ax. Indeed, then

\[ (A^{-1}y, y) = (A^{-1}Ax, Ax) = (x, Ax) = (Ax, x), \]

and the proof is complete.
(Is the reason for the last equality sign clear? Since A is positive, A
is Hermitian, and therefore (x, Ax) = (Ax, x).)


Solution 133. 


If E is the perpendicular projection onto M, so that E is the projection onto
M along M⊥, then Problem 82 implies that E* is the perpendicular projection
onto (M⊥)⊥ along M⊥. (Problem 82 talks about annihilators instead of
orthogonal complements, but the two languages can be translated back and
forth mechanically.) That means that E* is the perpendicular projection onto
M (along M⊥), and that is exactly E.

If, conversely, E = E² = E*, then the idempotence of E guarantees that
E is the projection onto ran E along ker E (Problem 72). If x is in ran E
and y is in ker E, then

\[ \begin{aligned}
(x, y) &= (Ex, y) && \text{(because the vectors in the range of a projection are fixed points of it; see Problem 72)} \\
&= (x, E^*y) && \text{(just by the definition of adjoints)} \\
&= (x, Ey) && \text{(because $E$ was assumed to be Hermitian)} \\
&= 0 && \text{(because $y$ is in $\ker E$).}
\end{aligned} \]

Consequence: ran E and ker E are not only complements; they are orthogonal
complements, and, therefore, E is a perpendicular projection.


Summary. Perpendicular projections are exactly those linear transformations 
that are both Hermitian and idempotent. 


Solution 134. 


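Since a perpendicular projection is Hermitian, the matrix of a projection on
C² must always look like

\[ \begin{pmatrix} \alpha & \beta \\ \bar\beta & \gamma \end{pmatrix}, \]

where α and γ must be real, and must, in fact, be in the unit interval. (Why?)
The question then is just this: which of the matrices that look like that are
idempotent?
To get the answer, compute. If

\[ \begin{pmatrix} \alpha & \beta \\ \bar\beta & \gamma \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ \bar\beta & \gamma \end{pmatrix}^2 = \begin{pmatrix} \alpha^2 + |\beta|^2 & \alpha\beta + \beta\gamma \\ \alpha\bar\beta + \bar\beta\gamma & |\beta|^2 + \gamma^2 \end{pmatrix}, \]

then (top right corner) α + γ = 1, so that

\[ \gamma = 1 - \alpha. \]

Consequence (lower right corner): |β|² + (1 − α)² = 1 − α, which simplifies
to

\[ |\beta|^2 = \alpha(1 - \alpha). \]

Conclusion: the matrices of projections on C² are exactly the ones of the form

\[ \begin{pmatrix} \alpha & \theta\sqrt{\alpha(1 - \alpha)} \\ \bar\theta\sqrt{\alpha(1 - \alpha)} & 1 - \alpha \end{pmatrix}, \]

where

\[ 0 \le \alpha \le 1 \quad\text{and}\quad |\theta| = 1. \]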

Comment. The case β = 0 seems to be more important than any other; in
any event it is the one we are most likely to bump into.
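The conclusion of Solution 134 says that a matrix with diagonal entries α and 1 − α and off-diagonal entries θ√(α(1 − α)) and its conjugate, where 0 ≤ α ≤ 1 and |θ| = 1, is Hermitian and idempotent. A quick numerical check is sketched below; the sample values of α and of the angle of θ are arbitrary, and the tolerance is an assumption needed only because of floating-point rounding.

```python
import cmath
import math

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def adjoint(X):
    """Conjugate transpose."""
    return [[X[j][i].conjugate() for j in range(len(X))]
            for i in range(len(X[0]))]

def close(X, Y, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(a - b) <= tol
               for row_x, row_y in zip(X, Y) for a, b in zip(row_x, row_y))

def projection(alpha, theta):
    """Matrix of the parametrized form, for 0 <= alpha <= 1 and |theta| = 1."""
    s = math.sqrt(alpha * (1 - alpha))
    return [[alpha, theta * s], [theta.conjugate() * s, 1 - alpha]]

samples = [projection(alpha, cmath.exp(1j * phi))
           for alpha in (0.0, 0.25, 0.5, 1.0) for phi in (0.0, 1.0, 2.5)]
all_ok = all(close(matmul(E, E), E) and close(adjoint(E), E) for E in samples)
print(all_ok)
```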


Solution 135.


If E and F are projections, with ran E = M and ran F = N, then the
statements

\[ E \le F \quad\text{and}\quad M \subset N \]

are equivalent.

Suppose, indeed, that E ≤ F. If x is in M, then

\[ (Fx, x) \ge (Ex, x) = (x, x). \]

Since the reverse inequality

\[ (x, x) \ge (Fx, x) \]

is always true, it follows that

\[ ((1 - F)x, x) = 0, \]

and hence that

\[ \|(1 - F)x\|^2 = 0. \]

(Why is the last “hence” true?) Conclusion: Fx = x, so that x is in N.

If, conversely, M ⊂ N, then FEx = Ex (because Ex is in M for all x),
so that FE = E. It follows (from adjoints) that EF = E, and that justifies
a small computational trick:

\[ (Ex, x) = \|Ex\|^2 = \|EFx\|^2 \le \|Fx\|^2 = (Fx, x). \]

Conclusion: E ≤ F.


Solution 136. 


If E and F are projections with ran E = M and ran F = N, then the
statements

\[ M \perp N \quad\text{and}\quad EF = 0 \]

are equivalent.
Suppose indeed that EF = 0. If x is in M and y is in N, then

\[ (x, y) = (Ex, Fy) = (x, E^*Fy) = (x, EFy) = 0. \]

If, conversely, M ⊥ N, so that

\[ N \subset M^\perp, \]

then, since Fx is in N for all x, it follows that Fx is in M⊥ for all x.
Conclusion: EFx = 0 for all x.


Solution 137. 


If A is Hermitian, and if x is a non-zero vector such that Ax = λx, then, of
course,

\[ (Ax, x) = \lambda(x, x); \]

since (Ax, x) is real (Problem 130), it follows that λ is real. If, in addition,
A is positive, so that (Ax, x) is positive, then it follows that λ is positive.

Note that these conditions on the eigenvalues of Hermitian and positive 
transformations are necessary, but nowhere near sufficient. 


Solution 138. 


The answer is that for Hermitian transformations eigenvectors belonging to
distinct eigenvalues must be orthogonal.
Suppose, indeed, that

\[ Ax_1 = \lambda_1 x_1 \quad\text{and}\quad Ax_2 = \lambda_2 x_2, \]

with λ_1 ≠ λ_2. If A is Hermitian, then

\[ \begin{aligned}
\lambda_1 (x_1, x_2) = (Ax_1, x_2) &= (x_1, Ax_2) && \text{(why?)} \\
&= \lambda_2 (x_1, x_2) && \text{(why?).}
\end{aligned} \]

Since λ_1 ≠ λ_2, it must follow that (x_1, x_2) = 0.


Comment. Since the product of the eigenvalues of a transformation on a 
finite-dimensional complex vector space is equal to its determinant (remember 
triangularization), these results imply that the determinant of a Hermitian 
transformation is real. Is there an obvious other way to get the same result? 


Chapter 9. Normality 


Solution 139. 


Caution: the answer depends on whether the underlying vector space is of 
finite or infinite dimension. 

For finite-dimensional spaces the answer is yes. Indeed, if U*U = 1,
then U must be injective, and therefore surjective (Problem 66), and therefore
invertible (definition), and once that’s known the equation U*U = 1 can be
multiplied by U⁻¹ on the right to get U* = U⁻¹.

For infinite-dimensional spaces the answer may be no. Consider, indeed, 
the set V of all finitely non-zero infinite sequences

\[ \{\xi_1, \xi_2, \xi_3, \ldots\} \]


of complex numbers. The phrase “finitely non-zero” means that each sequence 
has only a finite number of non-zero terms (though that finite number might 
vary from sequence to sequence). With the obvious way of adding sequences 
and multiplying them by complex scalars, V is a complex vector space. With 
the definition

\[ (\{\xi_1, \xi_2, \xi_3, \ldots\}, \{\eta_1, \eta_2, \eta_3, \ldots\}) = \sum_{n=1}^{\infty} \xi_n \bar\eta_n, \]

the space V becomes an inner product space. If U and W are defined by

\[ U\{\xi_1, \xi_2, \xi_3, \ldots\} = \{0, \xi_1, \xi_2, \xi_3, \ldots\} \]

and

\[ W\{\xi_1, \xi_2, \xi_3, \ldots\} = \{\xi_2, \xi_3, \xi_4, \ldots\}, \]

then U and W are linear transformations on V, and a simple computation
establishes that the equation

\[ (Ux, y) = (x, Wy) \]


is true for every pair of vectors x and y in V. In other words W is exactly 
the adjoint U* of U, and, as another, even simpler, computation shows 


U*U =1. 


(Caution: it is essential to keep in mind that when U*U is applied to a vector
x, the transformation U is applied first.) It is, however, not true that UU* = 1.
Not only does U* fail to be the inverse of U, but in fact U has no inverse
at all. The range of U contains only those vectors whose first coordinate is
0, so that the range of U is not the entire space V, and that’s what rules out
invertibility.
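The two shifts are easy to model on a computer, since a finitely non-zero sequence can be stored as a finite list with all later terms understood to be 0. The sketch below does that; the list representation and the sample vectors are assumptions of the illustration, not part of the argument.

```python
# Finitely non-zero sequences stored as finite lists (missing terms are 0).

def U(x):
    """Forward shift: {x1, x2, ...} -> {0, x1, x2, ...}."""
    return [0] + list(x)

def W(x):
    """Backward shift: {x1, x2, x3, ...} -> {x2, x3, ...}."""
    return list(x[1:])

def inner(x, y):
    """Sum of x_n * conj(y_n); zip stops at the shorter list, which is
    harmless because every omitted product has a zero factor."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

# Hypothetical sample vectors.
x = [1, 2j, 3]
y = [4, 5, 6j, 7]

print(inner(U(x), y) == inner(x, W(y)))  # W behaves like the adjoint of U
print(W(U(x)) == x)                      # U*U = 1
print(U(W(x)) == x)                      # UU* = 1 fails when x1 is not 0
```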


Solution 140. 


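When is

\[ \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \]

the matrix of a unitary transformation on C²? Answer: if
and only if the product of

\[ \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}^* \quad\text{and}\quad \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \]

is the identity matrix. Since

\[ \begin{pmatrix} \bar\alpha & \bar\gamma \\ \bar\beta & \bar\delta \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} |\alpha|^2 + |\gamma|^2 & \bar\alpha\beta + \bar\gamma\delta \\ \alpha\bar\beta + \gamma\bar\delta & |\beta|^2 + |\delta|^2 \end{pmatrix}, \]

that condition says that

\[ |\alpha|^2 + |\gamma|^2 = |\beta|^2 + |\delta|^2 = 1 \quad\text{and}\quad \bar\alpha\beta + \bar\gamma\delta = 0, \]

or, in other words that the vectors (α, γ) and (β, δ) in C² constitute an
orthonormal set.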

This 2 × 2 calculation extends to the general case. If U is a linear
transformation on a finite-dimensional inner product space, and if the matrix of U
with respect to an orthonormal basis is (u_ij), then a necessary and sufficient
condition that U be unitary is that

\[ \sum_k \bar{u}_{ki} u_{kj} = \delta_{ij}. \]


That matrix equation is, in fact, just the equation U*U = 1 in matrix notation. 
These comments make it easy to answer the questions about the special
matrices in (a), (b), and (c).

For (a): since the second row is not a unit vector, it doesn’t matter what
α is, the matrix can never be unitary.

For (b): the rows must be orthonormal. Since the square of the norm of
each row is |α|² + 1/4, the condition of normalization is equivalent to |α| = √3/2.
Since the inner product of the two rows is ½(−α + $\bar\alpha$), their orthogonality is
equivalent to α being real. Conclusion:

\[ \begin{pmatrix} \alpha & \tfrac12 \\ -\tfrac12 & \alpha \end{pmatrix} \]

is unitary if and only if α = ±√3/2.

For (c): the question is an awkward way of asking whether or not a
multiple of (1, 1, 1) can be the first term of an orthonormal set. The answer
is: why not? In detail: if ω is a complex cube root of 1, then the vectors

\[ (1, 1, 1), \quad (1, \omega, \omega^2), \quad\text{and}\quad (1, \omega^2, \omega) \]

are pairwise orthogonal (because 1 + ω + ω² = 0) and all have the same norm
(namely √3); normalization yields an explicit answer to the
question.
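The explicit answer to (c) can be confirmed numerically. The sketch below normalizes the three vectors built from a non-real cube root of 1 and checks that the resulting rows are orthonormal; the tolerance is an assumption made necessary by floating-point arithmetic.

```python
import cmath
import math

omega = cmath.exp(2j * math.pi / 3)  # a non-real cube root of 1
rows = [[1, 1, 1], [1, omega, omega ** 2], [1, omega ** 2, omega]]
U = [[entry / math.sqrt(3) for entry in row] for row in rows]

def inner(x, y):
    """Inner product, conjugate linear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

# Gram matrix of the rows; it should be the identity matrix.
gram = [[inner(r, s) for s in U] for r in U]
is_unitary = all(abs(gram[i][j] - (1 if i == j else 0)) < 1e-12
                 for i in range(3) for j in range(3))
print(is_unitary)
```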


Solution 141. 
None of the three conditions U* = U, U*U = 1, and U² = 1 implies any of
the others. Indeed,

\[ \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \]

is Hermitian but neither unitary nor involutory;

\[ \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \]

is unitary but neither Hermitian nor involutory; and

\[ \begin{pmatrix} 1 & -2 \\ 0 & -1 \end{pmatrix} \]

is involutory but neither Hermitian nor unitary.
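Claims of this kind are pleasant to verify mechanically. The sketch below tests three matrices against the three conditions: the diagonal matrix with entries 1 and 2, the matrix with rows (0, 1) and (−1, 0) (an assumed choice of a unitary example that is neither Hermitian nor involutory), and the matrix with rows (1, −2) and (0, −1).

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def adjoint(X):
    """Conjugate transpose."""
    return [[X[j][i].conjugate() for j in range(len(X))]
            for i in range(len(X[0]))]

IDENTITY = [[1, 0], [0, 1]]

def properties(X):
    """Return the triple (Hermitian?, unitary?, involutory?)."""
    return (X == adjoint(X),
            matmul(adjoint(X), X) == IDENTITY,
            matmul(X, X) == IDENTITY)

H = [[1, 0], [0, 2]]    # Hermitian only
V = [[0, 1], [-1, 0]]   # unitary only (assumed example)
J = [[1, -2], [0, -1]]  # involutory only

for name, X in (("H", H), ("V", V), ("J", J)):
    print(name, properties(X))
```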

The implicative power of pairs of these conditions is much greater than 
that of each single one; indeed it turns out that any two together imply the 
third. That’s very easy; here is how it goes. 

If U* = U, then the factor U* in U*U can be replaced by U, and,
consequently, U*U = 1 implies U² = 1.

If U*U = 1 and U² = 1, then of course U*U = U²; multiply by U⁻¹
(= U*) on the right and get U* = U.

If, finally, U* = U, then one of the factors U in U² can be replaced by
U*, and consequently, U² = 1 implies U*U = 1.


Solution 142.


Each row and each column of a unitary matrix is a unit vector. If, in particular, 
a unitary matrix is triangular (upper triangular, say), then its first column is 
of the form 


(*,0,0,0,...), 


and, consequently, those entries in the first row that come after the first can 
contribute nothing—they must all be 0. Proceed inductively: now it’s known 
that the second column is of the form 


(0, *,0,0,...), 


and, consequently, those entries in the second row that come after the first 
two can contribute nothing—etc., etc. Conclusion: a triangular unitary matrix 
must be diagonal. 


Comment. This solution tacitly assumed that the matrices in question corre- 
spond to unitary transformations via orthonormal bases. A similar comment 
applies in the next problem, about Hermitian diagonalizability. 


Solution 143. 


The answer is yes; every Hermitian matrix is unitarily similar to a diagonal 
one. This result is one of the cornerstones of linear algebra (or, perhaps more 
modestly, of the part of linear algebra known as unitary geometry). Its proof 
is sometimes considered recondite, but with the tools already available here 
it is easy. 

Suppose, indeed, that A is a Hermitian transformation with the distinct
eigenvalues

\[ \lambda_1, \ldots, \lambda_r \]

and corresponding eigenspaces M_i:

\[ M_i = \{x : Ax = \lambda_i x\}, \]

i = 1, ..., r. If i ≠ j (so that λ_i ≠ λ_j), then

\[ M_i \perp M_j \]

(by Problem 138). The M_i’s must span the entire space. Reason: the restriction
of A to the orthogonal complement of their span is still a Hermitian
transformation and, as such, has eigenvalues and corresponding eigenspaces;
unless that complement is {0}, A would therefore have an eigenvector
orthogonal to every M_i, which is impossible.
That settles everything. Just choose an orthonormal basis within each M_i,
and note that the union of all those little bases is an orthonormal basis for the
whole space. Otherwise said: there exists an orthonormal basis

\[ x_1, \ldots, x_n \]

of eigenvectors; the matrix of A with respect to that basis is diagonal.


Solution 144.


The answer is 1: every positive transformation has a unique positive square
root. A quick proof of existence goes via diagonalization. If A ≥ 0, then,
in particular, A is Hermitian, and, consequently, A can be represented by a
diagonal matrix such as

\[ \begin{pmatrix} \alpha & 0 & 0 & \cdots \\ 0 & \beta & 0 & \cdots \\ 0 & 0 & \gamma & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}. \]

The diagonal entries α, β, γ, ... are the eigenvalues of A, and, therefore, they
are real; since, moreover, A is positive, it follows that they are positive. Write

\[ B = \begin{pmatrix} \sqrt{\alpha} & 0 & 0 & \cdots \\ 0 & \sqrt{\beta} & 0 & \cdots \\ 0 & 0 & \sqrt{\gamma} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \]

(where the indicated numerical square roots are the positive ones), and jump
happily to the conclusions that (i) B ≥ 0 and (ii) B² = A.

What about uniqueness? If C ≥ 0 and C² = A, then C can be diagonalized,

\[ C = \begin{pmatrix} \xi & 0 & 0 & \cdots \\ 0 & \eta & 0 & \cdots \\ 0 & 0 & \zeta & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}. \]

The numbers ξ, η, ζ, ... are positive and their squares are the numbers α, β, γ, ....
Q.E.D.


Solution 145.


Every linear transformation A on a finite-dimensional inner product space is 
representable as 


A=UP 


with U unitary and P positive; if A is invertible, the representation is unique. 

To get a clue to a way of constructing U and P when only A is known,
think backward: assume the result and try to let it suggest the method. If
A = UP, then A* = PU*, and therefore

\[ A^*A = P^2. \]

That’s a big hint: since A*A and P² are positive linear transformations, they
have positive square roots; the equation P² = A*A implies the square root
equation

\[ P = \sqrt{A^*A}. \]

That’s enough of a hint: given A, define P by the preceding equation, and
then ask where U can come from. If A is to be equal to UP, then it’s tempting
to “divide through” by P, which would make sense if P were invertible. All
right: assume for a moment that A is invertible; in that case A* is invertible,
and so are A*A and P. If U is defined by

\[ U = AP^{-1}, \]

then

\[ U^*U = P^{-1}A^*AP^{-1} = P^{-1}P^2P^{-1} = 1, \]

and victory has been achieved: U is indeed unitary.
Uniqueness is not hard. If

\[ U_1P_1 = U_2P_2, \]

with U_1 and U_2 unitary and P_1 and P_2 invertible and positive, then

\[ P_1^2 = (U_1P_1)^*(U_1P_1) = (U_2P_2)^*(U_2P_2) = P_2^2, \]

and therefore (by the uniqueness of positive square roots)

\[ P_1 = P_2. \]

“Divide” the equation U_1P_1 = U_2P_2 through by P_1 (= P_2) and conclude
that U_1 = U_2.

If A is not invertible, the argument becomes a little more fussy. What is
wanted is Ax = UPx for all x, or, writing y = Px, what is wanted is

\[ Uy = Ax \]

whenever y is in the range of P. Can that equation be used as a definition of
U? Is it an unambiguous definition? That is: if one and the same y is in the
range of P for two reasons,

\[ y = Px_1 \quad\text{and}\quad y = Px_2, \]

must it then be true that Ax_1 = Ax_2? The answer is yes: write

\[ x = x_1 - x_2 \]

and note the identity

\[ \|Px\|^2 = (Px, Px) = (P^2x, x) = (A^*Ax, x) = \|Ax\|^2. \]

It implies that if Px = 0, then Ax = 0, or, in other words, that if

\[ Px_1 = Px_2, \]

then

\[ Ax_1 = Ax_2; \]

the proposed definition of U is indeed unambiguous.

Trouble: the proposed definition works on the range of P only; it defines a
linear transformation U with domain equal to ran P and range equal to ran A.
Since that linear transformation preserves lengths (and therefore distances), it
follows that ran A and ran P have the same dimension. Consequence: ran⊥ A
and ran⊥ P have the same dimension, and, consequently, there exists a linear
transformation V that maps ran⊥ P onto ran⊥ A and that preserves lengths.
Extend the transformation U (already defined on ran P) to the entire space
by defining it to be equal to V on ran⊥ P. The enlarged U has the property
that ||Ux|| = ||x|| for all x, which implies that it is unitary; since A = UP,
everything falls into place.

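In the non-invertible case there is no hope of uniqueness and the arbitrariness
of the definition of U used in the proof shows why. For a concrete
counterexample consider

\[ A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}; \]

both the equations

\[ A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \]

and

\[ A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \]

are polar decompositions of A.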


Solution 146.


Yes, eigenvectors belonging to distinct eigenvalues of a normal transformation
(on a finite-dimensional inner product space) must be orthogonal. The natural
way to try to prove that is to imitate the proof that worked for Hermitian (and
unitary) transformations. That is: assume that

\[ Ax_1 = \lambda_1 x_1 \quad\text{and}\quad Ax_2 = \lambda_2 x_2, \]

with λ_1 ≠ λ_2, and look at

\[ (Ax_1, x_2) = (x_1, A^*x_2). \]

The left term is equal to λ_1(x_1, x_2); so far, so good, but there isn’t any grip
on the right term. Or is there? Is there a connection between the eigenvalues
of a normal transformation and its adjoint? That is: granted that Ax = λx,
can something intelligent be said about A*x? Yes, but it’s a bit tricky.
The normality of A implies that

\[ \|Ax\|^2 = (Ax, Ax) = (A^*Ax, x) = (AA^*x, x) = (A^*x, A^*x) = \|A^*x\|^2. \]

Since A − λ is just as normal as A, and since

\[ (A - \lambda)^* = A^* - \bar\lambda, \]

it follows that

\[ \|(A - \lambda)x\| = \|(A^* - \bar\lambda)x\|. \]

Consequence: if λ is an eigenvalue of A with eigenvector x, then $\bar\lambda$ is an
eigenvalue of A* with the same eigenvector x.


The imitation of the proof that worked in the Hermitian case can now be
comfortably resumed: since

\[ (Ax_1, x_2) = \lambda_1 (x_1, x_2) \]

and

\[ (x_1, A^*x_2) = (x_1, \bar\lambda_2 x_2) = \lambda_2 (x_1, x_2), \]

the distinctness of λ_1 and λ_2 implies the vanishing of (x_1, x_2), and the proof
is complete.


Solution 147.


The answer is yes: normal transformations are diagonalizable. The key preliminary
question is whether or not every restriction of a normal transformation
is normal. That is: if A is normal on V, if M is a subspace of V, and if A_M
is the restriction A|M of A to M (which means that A_M x = Ax whenever
x is in M), does it follow that A_M is normal? The trouble with the question
is that it doesn’t quite make sense, and it doesn’t quite make sense even
for Hermitian transformations. The reason is that the restriction is rigorously
defined, but it may not be a linear transformation on M; that is, it may fail
to send vectors in M to vectors in M. For the question to make sense it must
be assumed that the subspace is invariant under the transformation. All right,
what if that is assumed?

One good way to learn the answer is to write the transformation A under
consideration as a 2 × 2 matrix according to the decomposition of the space
into M and M⊥. The result looks like

\[ A = \begin{pmatrix} P & * \\ 0 & * \end{pmatrix}, \]

where P is the linear transformation A_M on M and the asterisks are linear
transformations from M⊥ to M (top right corner) and from M⊥ to M⊥ (bottom
right corner). It doesn’t matter what linear transformations they are, and there
is no point in spending time inventing a notation for them; what is important
is the 0 in the lower left corner. The reason for that 0 is the assumed invariance
of M under A.
Once such a matrix representation is known for A, one for A* can be
deduced:

\[ A^* = \begin{pmatrix} P^* & 0 \\ * & * \end{pmatrix}. \]

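Now use the normality of A in an easy computation: since

\[ A^*A = \begin{pmatrix} P^* & 0 \\ * & * \end{pmatrix} \begin{pmatrix} P & * \\ 0 & * \end{pmatrix} = \begin{pmatrix} P^*P & * \\ * & * \end{pmatrix} \]

and

\[ AA^* = \begin{pmatrix} P & * \\ 0 & * \end{pmatrix} \begin{pmatrix} P^* & 0 \\ * & * \end{pmatrix} = \begin{pmatrix} PP^* + QQ^* & * \\ * & * \end{pmatrix} \]

(where, for the moment, Q denotes the top right corner of A), normality implies that

\[ P^*P = PP^* + QQ^*. \]

Since P*P and PP* have the same trace, the positive transformation QQ* has
trace 0, so that QQ* = 0 and therefore Q = 0. Consequence:

\[ P^*P = PP^*, \]

that is, normality implies that P is normal; in other words A_M is normal.
That’s all the hard work that has to be done; at this point the
diagonalizability theorem for normal transformations can be abandoned in good
conscience. The point is that intellectually the proof resembles the one for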
Hermitian transformations in every detail. There might be some virtue in 
checking the technical details, and the ambitious reader is encouraged to do 
so—examine the proof of diagonalizability for Hermitian transformations, re- 
place the word “Hermitian” by “normal”, delete all references to reality, and 
insist that the action take place on a complex inner product space, and note, 
happily, that the remaining parts of the proof remain unchanged. 


Language. The diagonalizability of normal (and, in particular, Hermitian) 
transformations is sometimes called the spectral theorem. 


Solution 148. 
If A and B are defined on C² by

\[ A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \]

then B is normal and every eigenspace of A is invariant under B, but A and
B do not commute.

If, however, A is normal, and every eigenspace of A is invariant under
B, then A and B do commute. The most obvious approach to the proof is
to use the spectral theorem (Problem 147); the main purpose of that theorem
is, after all, to describe the relation between a normal transformation and
its eigenspaces. The assertion of the theorem can be formulated this way: if
A is normal with distinct eigenvalues λ_1, ..., λ_r, and if E_j is, for each j,
the (perpendicular) projection on the eigenspace corresponding to λ_j, then
A = Σ_j λ_j E_j. The assumption that the eigenspace corresponding to λ_j is
invariant under B can be expressed in terms of E_j as the equation

\[ BE_j = E_jBE_j. \]

From the assumption that every eigenspace of A is invariant under B it
follows that the orthogonal complement of the eigenspace corresponding to
λ_j is invariant under B (because it is spanned by the other eigenspaces), and
hence that

\[ B(1 - E_j) = (1 - E_j)B(1 - E_j). \]

The two equations together simplify to

\[ BE_j = E_jB, \]

and that, in turn, implies the desired commutativity BA = AB.




318 LINEAR ALGEBRA PROBLEM BOOK 


Solution 149. 


There are three ways for two of three prescribed linear transformations A,
B, and C to be adjoints of one another; the adjoint pairs can be (A, B), or
(B, C), or (A, C). There are, therefore, except for notational differences, just
three possible commutativity hypotheses:


A with A* and A* with C, 
A with B and B with B*, 
A with B and B with A*. 


The questioned conclusion from the last of these is obviously false; for a 
counterexample choose A so that it is not normal and choose B = 0. The 
implications associated with the first two differ from one another in notation 
only; both say that if something commutes with a normal transformation, then 
it commutes with the adjoint of that normal transformation. That implication 
is true. 

The simplest proof uses the fact that if A is normal, then a necessary 
and sufficient condition for AB = BA is that each of the eigenspaces of 
A is invariant under B (see Solution 148). Consequence: if A is normal and 
AB = BA, then the eigenspaces of A are invariant under B. The normality of 
A implies that the eigenspaces of A are exactly the same as the eigenspaces of 
A*. Consequence: the eigenspaces of A* are invariant under B. Conclusion: 
A*B = BA*, and the proof is complete. 


Solution 150. 


Almost every known proof of the adjoint commutativity theorem (Solution
149) can be modified to yield the intertwining generalization: it is indeed true
that if A and B are normal and AS = SB, then A*S = SB*. Alternatively,
there is a neat derivation, via matrices whose entries are linear transformations,
of the intertwining version from the commutative one. Write

\[ \hat{A} = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} \quad\text{and}\quad \hat{B} = \begin{pmatrix} 0 & S \\ 0 & 0 \end{pmatrix}. \]

The transformation $\hat{A}$ is normal, and a straightforward verification proves
that $\hat{B}$ commutes with it. The adjoint commutativity theorem implies that
$\hat{B}$ commutes with $\hat{A}^*$ also. To get the desired conclusion from this fact, just
multiply the matrices $\hat{A}^*$ and $\hat{B}$ in both orders and compare corresponding
entries.


Solution 151.


Yes; if A, B, and AB are all normal, then BA is normal too. One good
way to prove that statement is a splendid illustration of what is called a trace
argument. In general terms, a trace argument can sometimes be used to prove
an equation between linear transformations, or, what comes to the same thing,
to prove that some linear transformation C is equal to 0, by proving that the
trace of C*C is 0. Since C*C is positive, the only way it can have trace 0 is
to be 0, and once C*C is known to be 0 it is immediate that C itself must
be 0. The main techniques available to prove that the trace of something is 0
are the additivity of trace,


\[ \mathrm{tr}(X + Y) = \mathrm{tr}\,X + \mathrm{tr}\,Y, \]


and the invariance of the trace of a product under cyclic permutations of its 
factors, 


tr(XY Z) = tr(ZXY). 


If it could be proved that A and B must commute, then all would be well 
(see the discussion preceding the statement of the problem), but that is not 
necessarily true (see the discussion preceding the statement of the problem). A 
step in the direction of commutativity can be taken anyway: the assumptions 
do imply that B commutes with A* A. That is: if C = BA* A — A* AB, then 
C = 0. That’s where the trace argument comes in. 

A good way to study C*C is to multiply out

\[ (A^*AB^* - B^*A^*A)(BA^*A - A^*AB), \]

getting

\[ A^*AB^*BA^*A - B^*A^*ABA^*A - A^*AB^*A^*AB + B^*A^*AA^*AB, \]

and then examine each of the four terms. As a device in that examination,
introduce an ad hoc equivalence relation, indicated by X ~ Y for any two
products X and Y, if they can be obtained from one another by a cyclic
permutation of factors. A curious thing happens: the assumptions (A, B, and
AB are normal) and the cyclic permutation property of trace imply that all
four terms are equivalent to one another. Indeed:

\[ \begin{aligned}
A^*AB^*BA^*A &= A^*ABB^*A^*A && \text{(because $B$ is normal)} \\
&\sim A^*B^*A^*ABA && \text{(because $AB$ is normal),} \\
B^*A^*ABA^*A &= B^*A^*ABAA^* && \text{(because $A$ is normal)} \\
&\sim A^*B^*A^*ABA, \\
A^*AB^*A^*AB &= AA^*B^*A^*AB && \text{(because $A$ is normal)} \\
&\sim A^*B^*A^*ABA, \\
B^*A^*AA^*AB &\sim AA^*ABB^*A^* \\
&= AA^*B^*A^*AB && \text{(because $AB$ is normal)} \\
&\sim A^*B^*A^*ABA.
\end{aligned} \]

Consequence: all four terms have the same trace, and, therefore, the trace of
C*C is 0.
The result of the preceding paragraph implies that B commutes with 
A* A. If A = UP is the polar decomposition of A, then U commutes with P 


(because A is normal), and, since B commutes with P? (= A* A), it follows 
that B also commutes with P. These commutativities imply that 


U*(AB)U =U*(UP)BU = (U*U)(BP)U = B(UP) = BA. 


Conclusion: BA is unitarily equivalent to the normal transformation AB, and, 
consequently, BA itself must be normal. 


Solution 152. 


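(a) The adjoint of a matrix is its conjugate transpose. Polynomials are not
clever enough to transpose matrices. If, for instance,

\[ A = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \]

then every polynomial in A is of the form

\[ \begin{pmatrix} \alpha & 0 \\ \beta & \alpha \end{pmatrix}, \]

which has no chance of being equal to

\[ A^* = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}. \]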

Question. What made this A work? Would any non-normal A work just as 
well? 

(b) This time the answer is yes; the inverse of an invertible matrix A can
always be obtained via a polynomial. For the proof, consider the characteristic
polynomial

\[ \lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + a_0 \]

of A and observe that a_0 cannot be 0. Reason: the assumed invertibility of A
implies that 0 is not an eigenvalue. Multiply the Hamilton-Cayley equation

\[ A^n + a_{n-1}A^{n-1} + \cdots + a_1A + a_0 = 0 \]

by A⁻¹ to get

\[ A^{n-1} + a_{n-1}A^{n-2} + \cdots + a_1 + a_0A^{-1} = 0. \]

Conclusion: if

\[ p(\lambda) = -\frac{1}{a_0}\left(\lambda^{n-1} + a_{n-1}\lambda^{n-2} + \cdots + a_1\right), \]

then p(A) = A⁻¹.
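For 2 × 2 matrices the recipe in (b) can be written out in full: the characteristic polynomial is λ² − (tr A)λ + det A, so that a_1 = −tr A and a_0 = det A, and the inverse is −(1/a_0)(A + a_1). The following sketch carries that out in exact rational arithmetic; the restriction to 2 × 2 matrices with integer entries and the sample matrix are assumptions made to keep the illustration short.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse_via_hamilton_cayley(A):
    """Inverse of an invertible 2x2 integer matrix as a polynomial in A."""
    (a, b), (c, d) = A
    a1 = -(a + d)        # coefficient of lambda: minus the trace
    a0 = a * d - b * c   # constant term: the determinant
    if a0 == 0:
        raise ValueError("0 is an eigenvalue, so the matrix is not invertible")
    # p(A) = -(1/a0) * (A + a1 * 1)
    return [[Fraction(-(a + a1), a0), Fraction(-b, a0)],
            [Fraction(-c, a0), Fraction(-(d + a1), a0)]]

# Hypothetical sample matrix.
A = [[2, 1], [5, 3]]
A_inv = inverse_via_hamilton_cayley(A)
print(A_inv)
print(matmul(A, A_inv) == [[1, 0], [0, 1]])
```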


Solution 153. 


The answer is that all positive matrices are Gramians. Suppose, indeed, that
A ≥ 0 and infer (Problem 144) that there exists a positive matrix B such that
B² = A. If A = (α_ij), then the equations

\[ \begin{aligned}
\alpha_{ij} &= (Ae_j, e_i) && \text{(why?)} \\
&= (B^2e_j, e_i) = (Be_j, Be_i)
\end{aligned} \]

imply that A is a Gramian (the Gramian of the vectors Be_1, Be_2, ...), and
that’s all there is to it.


Solution 154. 


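Squaring is not monotone; a simple counterexample is given by the matrices

\[ A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}. \]

The relation A ≤ B can be verified by inspection. Since

\[ A^2 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \ (= A) \quad\text{and}\quad B^2 = \begin{pmatrix} 5 & 3 \\ 3 & 2 \end{pmatrix}, \]

so that

\[ B^2 - A^2 = \begin{pmatrix} 4 & 3 \\ 3 & 2 \end{pmatrix}, \]

it is also easy to see that the relation A² ≤ B² is false; indeed, the determinant
of B² − A² is negative.

Is it a small blemish that not both the matrices in this example are invertible?
That’s easy to cure (at the cost of an additional small amount of
computation): the matrices A + 1 and B + 1 are also a counterexample.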
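The “additional small amount of computation” can be delegated to a machine. The sketch below checks both counterexamples, using the fact that a real symmetric 2 × 2 matrix is positive exactly when its diagonal entries and its determinant are non-negative; the helper names are inventions of this illustration.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def combine(X, Y, sign):
    """Entrywise X + sign * Y."""
    return [[a + sign * b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def is_positive_2x2(X):
    """Positivity test for a real symmetric 2x2 matrix."""
    (a, b), (c, d) = X
    return b == c and a >= 0 and d >= 0 and a * d - b * c >= 0

def squares_differ_positively(X, Y):
    """Is Y^2 - X^2 positive?"""
    return is_positive_2x2(combine(matmul(Y, Y), matmul(X, X), -1))

ONE = [[1, 0], [0, 1]]
A = [[1, 0], [0, 0]]
B = [[2, 1], [1, 1]]
A1, B1 = combine(A, ONE, 1), combine(B, ONE, 1)

print(is_positive_2x2(combine(B, A, -1)), squares_differ_positively(A, B))
print(is_positive_2x2(combine(B1, A1, -1)), squares_differ_positively(A1, B1))
```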


That’s the bad news; for square roots the news is good. That is: if

\[ 0 \le A \le B, \]

then it is true that

\[ \sqrt{A} \le \sqrt{B}. \]

Various proofs of that conclusion can be constructed, but none of them jumps 
to the eye; the only way to go is by honest toil. The idea of one proof is 
to show that every eigenvalue of $\sqrt{B} - \sqrt{A}$ is non-negative; for invertible 
Hermitian transformations that property is equivalent to positiveness. All right: 
suppose then that $\lambda$ is an eigenvalue of $\sqrt{B} - \sqrt{A}$, with corresponding 
(non-zero) eigenvector x, so that 

$$\sqrt{A}x = \sqrt{B}x - \lambda x;$$

it is to be shown that $\lambda \ge 0$. 

If it happens that $\sqrt{B}x = 0$, then, of course, $Bx = 0$, and therefore 
it follows from the assumed relation between A and B that $(Ax, x) = 0$. 
Consequence: $\sqrt{A}x = 0$. Reason: 

$$0 = (Ax, x) = (\sqrt{A}\sqrt{A}x, x) = (\sqrt{A}x, \sqrt{A}x) = \|\sqrt{A}x\|^2.$$

Once that’s known, then the assumed eigenvalue equation implies that $\lambda x = 0$, 
and hence that $\lambda = 0$. 

If $\sqrt{B}x \ne 0$, then $(Bx, x) \ne 0$; to see that, apply the chain of 
equations displayed just above to B instead of A. Consequence: 

$$\begin{aligned}
(\sqrt{B}x, \sqrt{B}x) &= \|\sqrt{B}x\|^2 \\
&\ge \|\sqrt{B}x\| \cdot \|\sqrt{A}x\| \quad \text{(why?)} \\
&\ge (\sqrt{B}x, \sqrt{A}x) \quad \text{(why?)} \\
&= (\sqrt{B}x, \sqrt{B}x - \lambda x) \\
&= (\sqrt{B}x, \sqrt{B}x) - \lambda(\sqrt{B}x, x).
\end{aligned}$$

Conclusion: $\lambda \ge 0$, because the contrary possibility yields the contradiction 

$$(\sqrt{B}x, \sqrt{B}x) > (\sqrt{B}x, \sqrt{B}x).$$


The proof is complete. 
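
Both halves of the story (squaring fails, square roots succeed) can be watched on the matrices of the counterexample, $A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$. The Python sketch below is an illustration under stated assumptions, not part of the proof: the helper names are hypothetical, positivity of a real symmetric 2 x 2 matrix is tested by the signs of its trace and determinant, and the square root is computed by the closed formula $\sqrt{M} = (M + s)/t$ with $s = \sqrt{\det M}$ and $t = \sqrt{\operatorname{tr} M + 2s}$, which is special to the 2 x 2 case.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]

def is_positive(M, tol=1e-12):
    """A real symmetric 2x2 matrix is positive (semidefinite) exactly when
    its trace and its determinant are both non-negative."""
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return trace >= -tol and det >= -tol

def sqrt_2x2(M):
    """Positive square root of a positive, non-zero 2x2 matrix M."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    s = math.sqrt(max(det, 0.0))
    t = math.sqrt(M[0][0] + M[1][1] + 2 * s)
    return [[(M[i][j] + (s if i == j else 0.0)) / t for j in range(2)]
            for i in range(2)]

A = [[1, 0], [0, 0]]
B = [[2, 1], [1, 1]]

A_le_B = is_positive(sub(B, A))                                 # A <= B ?
squares_ordered = is_positive(sub(matmul(B, B), matmul(A, A)))  # A^2 <= B^2 ?
roots_ordered = is_positive(sub(sqrt_2x2(B), sqrt_2x2(A)))      # sqrt(A) <= sqrt(B) ?
```

One example proves nothing about square roots in general, of course; the check is only a sanity test of the statement.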


Solution 155. 


In some shallow combinatorial sense there are 32 cases to examine: 16 obtained 
via combining the four constituents ran A, ker A, ran A*, and ker A* 
with one another by spans, and 16 others via combining them by intersections. 
Consideration of the duality given by orthogonal complements (and other, even 
simpler eliminations) quickly reduces the possibilities to two, namely 

$$\ker A \cap \ker A^* \quad \text{and} \quad \operatorname{ran} A \cap \operatorname{ran} A^*.$$


The first of these is always a reducing subspace; indeed both A and 
A* map it into {0}. An explicit look at the duality can do no harm: since 
the orthogonal complement of a reducing subspace is a reducing subspace, 
it follows that ran A + ran A* is always a reducing subspace. This corollary 
is just as easy to get directly: A maps everything, and therefore in particular 
ran A + ran A*, into ran A (which is included in ran A + ran A*), and a 
similar statement is true about A*. 

The second possibility, $\operatorname{ran} A \cap \operatorname{ran} A^*$, is not always a reducing subspace. 
One easy counterexample is given by 


Its range consists of all vectors of the form $(\alpha, \beta, 0)$, and the range of its 
adjoint consists of all vectors of the form $(0, \beta, \gamma)$. The intersection of the 
two ranges is the set of all vectors of the form $(0, \beta, 0)$, which is not only 
not invariant under both A and A*, but, in fact, is invariant under neither. 
The dual is ker A + ker A*, which in the present case consists of the set of 
all vectors of the form $(\alpha, 0, \gamma)$, not invariant under either A or A*. 
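
The claims about the ranges can be checked mechanically. The matrix of the counterexample is not displayed in this text, so the Python sketch below rests on an assumption: it uses the 3 x 3 matrix with 1's just above the diagonal, which is one matrix whose range and whose adjoint's range have exactly the forms described. The helper names are hypothetical.

```python
def apply(M, v):
    """Matrix times vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# ASSUMPTION: one matrix consistent with the description in the text.
A = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]
A_star = [list(col) for col in zip(*A)]   # real matrix: adjoint = transpose

def in_intersection(v):
    """ran A and ran A* intersect in the vectors of the form (0, b, 0)."""
    return v[0] == 0 and v[2] == 0

v = [0, 1, 0]                          # a typical vector of the intersection
image_under_A = apply(A, v)            # leaves the intersection
image_under_A_star = apply(A_star, v)  # leaves it too
```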


Solution 156. 


The only eigenvalue of A is 0 (look at the diagonal of the matrix). If 
$x = (\alpha_1, \alpha_2, \ldots, \alpha_n)$, then 

$$Ax = (0, \alpha_1, \alpha_2, \ldots, \alpha_{n-1});$$

it follows that $Ax = 0$ if and only if x is a multiple of 

$$x_n = (0, 0, \ldots, 0, 1).$$

That is: although the algebraic multiplicity of 0 as an eigenvalue of A is n 
(the characteristic polynomial of A is $(-\lambda)^n$), the geometric multiplicity is 
only 1. One way to emphasize the important one of these facts is to say that 
the subspace $M_1$ consisting of all multiples of $x_n$ is the only 1-dimensional 
subspace invariant under A. 

Are there any 2-dimensional subspaces invariant under A? Yes; one of 
them is the subspace $M_2$ spanned by the last two basis vectors $x_{n-1}$ and 
$x_n$ (or, equivalently, the subspace consisting of all vectors whose first $n - 2$ 
coordinates vanish). That, moreover, is the only possibility. Reason: every 
such subspace has to contain $x_n$ (because it has to contain an eigenvector), 
and, since A is nilpotent, the restriction of A to each such subspace must be 
nilpotent (of index 2). It follows that each such subspace must contain at least 
one vector y in $M_2$ that is not in $M_1$, and hence (consider the span of y and 
$x_n$) must coincide with $M_2$. 

The rest of the proof climbs up an inductive ladder. If $M_k$ is the subspace 
spanned by the last k vectors of the basis $\{x_1, x_2, \ldots, x_n\}$ (or, equivalently, 
the subspace consisting of all vectors whose first $n - k$ coordinates vanish), 
then it is obvious that each $M_k$ is invariant under the truncated shift A, and by 
a modification of the argument of the preceding paragraph (just keep raising 
the dimensions by 1) it follows that $M_k$ is in fact the only invariant subspace 
of dimension k. (Is it permissible to interpret $M_0$ as {0}?) 

Conclusion: the number of invariant subspaces is n + 1, and the number 
of reducing subspaces is 2; the truncated shift is irreducible. 


Solution 157. 
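The matrix A is the direct sum of the $2 \times 2$ matrix 

$$B = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$$

and the $1 \times 1$ matrix 0. A few seconds’ reflection should yield the conclusion 
that the same direct sum statement can be made about 

$$\hat{A} = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix};$$

the only difference between A and $\hat{A}$ is that for $\hat{A}$ the third column plays 
the role that the second column played for A. A more formal way of saying 
that is to say that the permutation matrix that interchanges the second and the 
third columns effects a similarity between A and $\hat{A}$: 

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} \cdot \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \cdot \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

Since $\hat{A}$ does have a square root, namely the $3 \times 3$ truncated shift, so does 
A. Since in fact, more generally, 

$$\begin{pmatrix} 0 & \xi & \eta \\ 0 & 0 & \frac{1}{\xi} \\ 0 & 0 & 0 \end{pmatrix}$$

is a square root of $\hat{A}$, it follows that A too has many square roots, namely 
the matrices of the form 

$$\begin{pmatrix} 0 & \eta & \xi \\ 0 & 0 & 0 \\ 0 & \frac{1}{\xi} & 0 \end{pmatrix}$$

obtained from the square roots of $\hat{A}$ by the permutation similarity. 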
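
That there really are many square roots is quickly confirmed by multiplication. In the Python sketch below the parameter names `xi` and `eta` are assumptions (the symbols are not legible in this text), and the function name is hypothetical; what is being checked is only that a matrix with entries $\eta$ and $\xi$ in the first row and $1/\xi$ in the third row squares to A.

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# The matrix A of the problem: the direct sum of [[0, 1], [0, 0]] and [0].
A = [[0, 1, 0],
     [0, 0, 0],
     [0, 0, 0]]

def square_root_of_A(xi, eta):
    """Candidate square root of A; xi must be different from 0."""
    xi, eta = Fraction(xi), Fraction(eta)
    return [[0, eta, xi],
            [0, 0, 0],
            [0, 1 / xi, 0]]

# A few arbitrary parameter values.
roots = [square_root_of_A(xi, eta) for xi, eta in [(1, 0), (2, 5), (-3, 7)]]
all_square_to_A = all(matmul(R, R) == A for R in roots)
```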


Solution 158. 


Similar normal transformations are unitarily equivalent. Suppose, indeed, that 
$A_1$ and $A_2$ are normal and that 

$$A_1 B = B A_2,$$

where B is invertible. Let B = UP be the polar decomposition of B (Problem 
145), so that U is unitary and $P = \sqrt{B^*B}$, and compute as follows: 

$$\begin{aligned}
A_2(B^*B) &= (A_2 B^*)B \\
&= (B^* A_1)B && \text{(by the facts about adjoint intertwining, Solution 150)} \\
&= B^*(A_1 B) = B^*(BA_2) && \text{(by assumption)} \\
&= (B^*B)A_2.
\end{aligned}$$

The result is that 

$$A_2 P^2 = P^2 A_2,$$

from which it follows (since P is a polynomial in $P^2$) that 

$$A_2 P = P A_2.$$

Consequence: 

$$\begin{aligned}
A_1 UP &= UPA_2 && \text{(by assumption)} \\
&= UA_2 P && \text{(by what was just proved)},
\end{aligned}$$

and therefore, since P is invertible, 

$$A_1 U = U A_2.$$

That completes the proof of the unitary equivalence of $A_1$ and $A_2$. 


Solution 159. 


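Are the matrices 

$$A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 0 & 0 & 0 \\ 2 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$$

unitarily equivalent? The answer is yes, and it’s not especially surprising; if 

$$U = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix},$$

then $U^*AU = B$. 

Are the matrices 

$$A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 0 & 2 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}$$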


unitarily equivalent? The surprising answer is no. More or less sophisticated 
proofs for that negative answer are available, but the quickest proof is a simple 
computation that is not sophisticated at all. What can be said about a 3 x 3 
matrix S with the property that 


SA = BS? 


Written down in terms of matrix entries, the question becomes a system of 
nine equations in nine unknowns. The general solution of the system is easy 
to find; the answer is that the matrix S must have the form 

$$S = \begin{pmatrix} 2\xi & \eta & \zeta \\ 0 & \xi & \eta \\ 0 & 0 & 2\xi \end{pmatrix}.$$

A matrix like that cannot possibly be unitary, and that settles that. 

An alternative proof is based on the observation that 

$$A^2 = B^2 = \begin{pmatrix} 0 & 0 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

Since SA = BS implies that $SA^2 = B^2 S$, it becomes pertinent to find out 
which matrices commute with $A^2$. That’s another simple computation, which 
leads to the same conclusion. 
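
The computations of this solution are short enough to be delegated. The Python sketch below is a check under stated assumptions, not a proof: the parameter names `xi`, `eta`, `zeta` for the entries of S are hypothetical (the symbols are not legible in this text), and `B_first` and `B_second` are names introduced here for the B's of the two pairs. Since all the matrices are real, adjoints are transposes.

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(X):
    return [list(col) for col in zip(*X)]

A = [[0, 1, 0], [0, 0, 2], [0, 0, 0]]
B_first = [[0, 0, 0], [2, 0, 0], [0, 1, 0]]
B_second = [[0, 2, 0], [0, 0, 1], [0, 0, 0]]
U = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]

def intertwiner(xi, eta, zeta):
    """A matrix S of the upper triangular form that satisfies SA = BS
    for the second pair."""
    xi, eta, zeta = Fraction(xi), Fraction(eta), Fraction(zeta)
    return [[2 * xi, eta, zeta],
            [0, xi, eta],
            [0, 0, 2 * xi]]

first_pair_equivalent = matmul(matmul(transpose(U), A), U) == B_first
S = intertwiner(3, -1, 5)
S_intertwines = matmul(S, A) == matmul(B_second, S)
squares_agree = matmul(A, A) == matmul(B_second, B_second)
```

Why can no such S be unitary? One way to see it: the columns of a unitary matrix have norm 1, so the first column forces $|\xi| = 1/2$, the third column then forces $\eta = \zeta = 0$, and that leaves the second column with norm 1/2.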

These comments seem not to address the main issue (unitary equivalence 
of transposes), but in fact they come quite close to it. The A’s in the two pairs 
of examples are the same, but the B’s are not: the first B is the transpose of 
the second. Since the first B is unitarily equivalent to A but the second one 
is not, since, in fact, the second B is unitarily equivalent to the transpose of 
A, it follows that A is not unitarily equivalent to its own transpose, and that 
settles the issue. 

Yes, it settles the issue, but not very satisfactorily. How could one possibly 
discover such examples, and, having discovered them, how could one give a 
conceptual proof that they work instead of an unenlightening computational 
one? 

Here is a possible road to discovery. What is sought is a matrix A that is 
not unitarily equivalent to the transpose A'. Write A in polar form A = UP, 
with U unitary and P positive (Problem 145), and assume for the time being 
that P is invertible. There is no real loss of generality in that assumption; if 
there is any example at all, then there are both invertible and non-invertible 
examples. Proof: the addition of a scalar doesn’t change the unitary equiv- 
alence property in question. Since, moreover, transforming every matrix in 
sight by a fixed unitary one doesn’t change the unitary equivalence property 
in question either, there is no loss of generality in assuming that the matrix 
P is in fact diagonal. 

If A = UP, then $A' = PU'$, so that to say that A and A' are unitarily 
equivalent is the same as saying that there exists a unitary matrix W such 
that 

$$W^*(UP)W = PU' \qquad (*)$$

or, equivalently, such that 

$$(W^*U)P(W\bar{U}) = P.$$

(The symbol $\bar{U}$ here denotes the complex conjugate of the matrix U.) Assume 
then that (*) is true, and write $Q = W^*U$, and $R = W\bar{U}$; note that Q and 
R are unitary and that 

$$QPR = P.$$

It follows that 

$$P^2 = PP^* = QPRR^*PQ^* = QP^2Q^*,$$

so that Q commutes with $P^2$; since P is a polynomial in $P^2$, it follows that 
Q commutes with P (and similarly that R commutes with P). 

To get a powerful grip on the argument, it is now a good idea to make a 
restrictive assumption: assume that the diagonal entries (the eigenvalues) of 
P are all distinct. In view of the commutativity of Q and P, that assumption 
implies that Q too is diagonal and hence, incidentally, that $W = UQ^*$. The 
equation (*) yields 

$$PQUQ^* = QPUQ^* = PU',$$

and hence, since P is invertible, that 

$$QUQ^* = U'.$$

Since the entries of the unitary diagonal matrix Q are complex numbers of 
absolute value 1, it follows that the absolute values of the entries of the matrix U 
constitute a symmetric matrix. 

That last result is unexpected but does not seem to be very powerful; in 
fact, it solves the problem. The assumption of the existence of W has implied 
that U must satisfy a condition. The matrix U, however, has not yet been 
specified; it could have been chosen to be a quite arbitrary unitary matrix. If 
it is chosen so as not to satisfy the necessary condition that the existence of 
W imposes, then it follows that no W can exist, and victory is achieved. 

The simplest example of a unitary matrix whose absolute values do not 
form a symmetric matrix is 


$$U = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}.$$


A simple P that can be used (positive, diagonal, invertible, with distinct 
eigenvalues) is given by 


Since, however, invertibility is an unnecessary luxury, an even simpler one is 
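
$$P = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix};$$

if that one is used, then the resulting counterexample is 

$$A = UP = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} \cdot \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix},$$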


and the process of “discovery” is complete. 
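
The ingredients of the discovery can be verified in a few lines. The Python sketch below is only a check of the two facts used (that U is unitary with a non-symmetric matrix of absolute values, and that UP is the matrix A of the counterexample); the variable names are introduced here for illustration.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(X):
    return [list(col) for col in zip(*X)]

U = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]   # cyclic permutation matrix
P = [[0, 0, 0], [0, 1, 0], [0, 0, 2]]   # positive, diagonal, distinct entries

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
U_is_unitary = matmul(transpose(U), U) == identity   # U is real, so U* = U'

abs_U = [[abs(x) for x in row] for row in U]
abs_U_is_symmetric = abs_U == transpose(abs_U)

A = matmul(U, P)
```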


Solution 160. 


If A and B are real, U is unitary, and U* AU = B, then there exists a real 
orthogonal V such that V* AV = B. 

A surprisingly important tool in the proof is the observation that the 
unitary equivalence of A and B via U implies the same result for A* and 
B*. Indeed, the adjoint of the assumed equation is U* A*U = B*. 

Write U in terms of its real and imaginary parts (compare Solution 89): 
U = E + iF. It follows from AU = UB that AE = EB and AF = FB, 
and hence that $A(E + \lambda F) = (E + \lambda F)B$ for every scalar $\lambda$. If $\lambda$ is real 
and different from a finite number of troublesome scalars (the ones for which 
$\det(E + \lambda F) = 0$; there are only finitely many of them because that determinant 
is a polynomial in $\lambda$ whose value at $\lambda = i$ is $\det U \ne 0$), the real matrix 
$S = E + \lambda F$ is invertible, and, of course, has the property that AS = SB. 

Proceed in the same way from U*A*U = B*: deduce that 
$A^*(E + \lambda F) = (E + \lambda F)B^*$ for all $\lambda$, and, in particular, for the ones for 
which $E + \lambda F$ is invertible, and infer that $A^*S = SB^*$ (and hence that 
$S^*A = BS^*$). 

From here on in the technique of Solution 158 works. Let S = VP be 
the polar decomposition of S (that theorem works just as well in the real case 
as in the complex one, so that V and P are real). Since 

$$BP^2 = BS^*S = S^*AS = S^*SB = P^2B,$$

so that $P^2$ commutes with B, it follows that P commutes with B. Since 

$$AVP = AS = SB = VPB = VBP$$

and P is invertible, it follows that AV = VB, and the proof is complete. 


Solution 161. 


It is a worrisome fact that eigenvalues of absolute value 1 can not only stop 
the powers of a matrix from tending to 0, they can even make those powers 
explode to infinity. Example: if 


then 


Despite this bad omen, strict inequalities do produce the desired convergence. 

An efficient way to prove convergence is to use the Jordan canonical 
form. (Note that $A^n \to 0$ if and only if $(S^{-1}AS)^n \to 0$.) The relevant part 
of Jordan theory is the assertion that (the Jordan form of) A is the direct sum 
of matrices of the form $\lambda + B$, where B is nilpotent (of some index k). Since 

$$(\lambda + B)^n = \lambda^n + \binom{n}{1}\lambda^{n-1}B + \cdots + \binom{n}{k-1}\lambda^{n-k+1}B^{k-1}$$

as soon as $n \ge k - 1$, and since the assumption $|\lambda| < 1$ (strict inequality) 
implies that the coefficients tend to 0 as $n \to \infty$, the proof is complete. 
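
The binomial expansion, and the contrast between $|\lambda| < 1$ and $|\lambda| = 1$, can be watched in exact arithmetic. The Python sketch below makes assumptions that are not in the text: the nilpotent B is taken to have 1's just below the diagonal, the sample values $\lambda = 1/2$ and $\lambda = 1$ are arbitrary, and all function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(M, n):
    """n-th power by repeated multiplication."""
    size = len(M)
    result = [[Fraction(int(i == j)) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, M)
    return result

def block(lam, k):
    """k x k matrix lam + B, where B has 1's just below the diagonal."""
    return [[Fraction(lam) if i == j else Fraction(int(i == j + 1))
             for j in range(k)] for i in range(k)]

def binomial_power(lam, k, n):
    """(lam + B)^n from the binomial expansion: the (i, j) entry is
    C(n, m) * lam^(n - m) with m = i - j when 0 <= m <= n, and 0 otherwise."""
    lam = Fraction(lam)
    return [[comb(n, i - j) * lam ** (n - (i - j)) if 0 <= i - j <= n
             else Fraction(0) for j in range(k)] for i in range(k)]

def largest_entry(M):
    return max(abs(x) for row in M for x in row)

small = largest_entry(matpow(block(Fraction(1, 2), 3), 60))   # |lam| < 1
growing = matpow(block(1, 2), 7)                              # |lam| = 1
```

With $\lambda = 1$ the entry below the diagonal of the n-th power is n itself, which is the explosion that the strict inequality rules out.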


Solution 162. 


Yes, every power bounded transformation is similar to a contraction. 

Note first that if A is power bounded, then every eigenvalue of A is less 
than or equal to 1 in absolute value. (Compare the reasoning preceding the 
statement of Problem 161.) To get more powerful information, use the Jordan 
form to write (the matrix of) A as the direct sum of matrices of one of the 
forms 

$$E = \begin{pmatrix} \lambda & 0 & 0 & 0 \\ 1 & \lambda & 0 & 0 \\ 0 & 1 & \lambda & 0 \\ 0 & 0 & 1 & \lambda \end{pmatrix} \quad \text{or} \quad F = \begin{pmatrix} \lambda & 0 & 0 & 0 \\ 0 & \lambda & 0 & 0 \\ 0 & 0 & \lambda & 0 \\ 0 & 0 & 0 & \lambda \end{pmatrix},$$

where, for typographical convenience, $4 \times 4$ matrices are used to indicate the 
general $n \times n$ case. It is then enough to prove that each such direct summand 
that can actually occur in a power bounded matrix is similar to a contraction. 

Since $|\lambda| \le 1$, the matrix F is a contraction, and nothing else needs to 
be said about it. As far as E is concerned, two things must be said: first, 
$|\lambda|$ cannot be equal to 1, and, second, when $|\lambda| < 1$, then E is similar to a 
contraction. 

As for the first, a direct computation shows that the entry in row 2, column 
1 of $E^n$ is $n\lambda^{n-1}$; if $|\lambda| = 1$, that is inconsistent with power boundedness. 

As for the second, E is similar to 

$$E_\varepsilon = \begin{pmatrix} \lambda & 0 & 0 & 0 \\ \varepsilon & \lambda & 0 & 0 \\ 0 & \varepsilon & \lambda & 0 \\ 0 & 0 & \varepsilon & \lambda \end{pmatrix},$$

where $\varepsilon$ can be any number different from 0. There are two ways to prove 
that similarity: brute force and pure thought. For brute force, form 

$$S = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \varepsilon & 0 & 0 \\ 0 & 0 & \varepsilon^2 & 0 \\ 0 & 0 & 0 & \varepsilon^3 \end{pmatrix},$$

and verify that $SES^{-1} = E_\varepsilon$. For pure thought, check, by inspection, that E 
and $E_\varepsilon$ have the same elementary divisors, and therefore, by abstract similarity 
theory, they must be similar. 

The proof can now be completed by observing that if $|\lambda| < 1$ and $\varepsilon$ is 
sufficiently small, then $E_\varepsilon$ is a contraction. The quickest way of establishing 
that observation is to recall that $\|X\|$ is a continuous function of X, and 
that, therefore, $\|E_\varepsilon\|$ is a continuous function of $\varepsilon$. Since $\|E_0\| = |\lambda| < 1$, it 
follows that $\|E_\varepsilon\| < 1$ when $\varepsilon$ is sufficiently small, and that settles everything. 
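
The brute force verification is a pleasant job for a machine. The Python sketch below checks $SES^{-1} = E_\varepsilon$ in exact arithmetic for the sample values $\lambda = 1/2$ and $\varepsilon = 1/4$ (both arbitrary choices made here), and then, instead of the continuity argument of the text, it uses a different and cruder tool: the standard estimate that the norm of a matrix is at most the square root of the product of its largest absolute row sum and its largest absolute column sum. The function names are hypothetical.

```python
from fractions import Fraction
import math

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def summand(lam, eps, n=4):
    """lam on the diagonal, eps just below it (eps = 1 gives E)."""
    return [[lam if i == j else (eps if i == j + 1 else Fraction(0))
             for j in range(n)] for i in range(n)]

def scaling(eps, n=4):
    """S = diag(1, eps, eps^2, ...) together with its inverse."""
    S = [[eps ** i if i == j else Fraction(0) for j in range(n)]
         for i in range(n)]
    S_inv = [[1 / eps ** i if i == j else Fraction(0) for j in range(n)]
             for i in range(n)]
    return S, S_inv

def norm_bound(M):
    """Upper bound for the operator norm:
    sqrt(largest absolute row sum * largest absolute column sum)."""
    n = len(M)
    row = max(sum(abs(M[i][j]) for j in range(n)) for i in range(n))
    col = max(sum(abs(M[i][j]) for i in range(n)) for j in range(n))
    return math.sqrt(row * col)

lam, eps = Fraction(1, 2), Fraction(1, 4)
E = summand(lam, Fraction(1))
E_eps = summand(lam, eps)
S, S_inv = scaling(eps)

similar = matmul(matmul(S, E), S_inv) == E_eps
bound = norm_bound(E_eps)   # at most |lam| + |eps| for matrices of this shape
```

For matrices of this shape the bound is $|\lambda| + |\varepsilon|$, so that any $\varepsilon$ with $0 < |\varepsilon| < 1 - |\lambda|$ will do.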


Solution 163. 


What is obvious is that some nilpotent transformations of index 2 can be 
reducible: just form direct sums. That can be done even in spaces of dimension 
3; the direct sum of a 0 (of size 1) and a nilpotent of index 2 (of size 2) is 
nilpotent of index 2 (and size 3). What is not obvious is that, in fact, on 
a space V of dimension greater than 2 every nilpotent transformation A of 
index 2 must be reducible. In the proof it is permissible to assume that $A \ne 0$ 
(for otherwise the conclusion is trivial). 

(1) $V = \ker A + \ker A^*$. Reason: $V = \operatorname{ran} A + \operatorname{ran}^\perp A$; nilpotence of 
index 2 implies that $\operatorname{ran} A \subset \ker A$, and always $\operatorname{ran}^\perp A = \ker A^*$. In the rest 
of the proof it is permissible to assume that 

$$\ker A \cap \ker A^* = \{0\}$$

(for if $x \ne 0$ but $Ax = A^*x = 0$, then the span of x is a 1-dimensional 
reducing subspace). 

(2) The dimension of ker A* (the nullity of A*, abbreviated null A*) is 
strictly greater than 1. Since A and A* play completely symmetric roles in all 
these considerations, it is sufficient to prove that null A > 1 (and that way 
there is less notational fuss). Suppose, indeed, that $\operatorname{null} A \le 1$. Since 
$\operatorname{ran} A \subset \ker A$, it follows that $\operatorname{rank} A \le 1$, and since $A \ne 0$ 
by assumption, rank A must be 1 (not 0). Since $\operatorname{ran} A = A(\ker^\perp A)$ and the 
restriction of A to $\ker^\perp A$ is a one-to-one transformation, it follows that 

$$\dim \ker^\perp A = \dim \operatorname{ran} A = \operatorname{rank} A = 1.$$

Thus both ker A and ker A* have dimension 1, and hence V has dimension 
2 (see (1) above), contradicting the assumption that dim V > 2. This 
contradiction destroys the hypothesis $\operatorname{null} A \le 1$. 

(3) If $x \in \ker A^*$, then $A^*Ax \in \operatorname{ran} A^* \subset \ker A^*$; in other words, 
the subspace ker A* is invariant under the Hermitian transformation A*A. 
It follows that ker A* contains an eigenvector of A*A, or, equivalently, that 
ker A* has a subspace N of dimension 1 that is invariant under A*A. 

(4) Consider the subspace M = N + AN. Since A maps N to AN and 
AN to {0} (recall that $A^2 = 0$), the subspace M is invariant under A. Since 
A* maps N to {0} (recall that $N \subset \ker A^*$) and AN to N, the subspace M is 
invariant under A*. Consequence: M reduces A. Since $M \supset N$, the dimension 
of M is not less than 1, and since M = N + AN, the dimension of M is not 
more than 1 + 1. Conclusion: M is a non-trivial proper reducing subspace 
for A. 


Solution 164. 


Yes, a nilpotent transformation of index 3 can be irreducible on $\mathbb{C}^4$. One 
example, in a sense “between” the truncated shift and its square, is given by 
the matrix 

$$A = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} \quad \text{with adjoint} \quad A^* = \begin{pmatrix} 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$

The kernel of A is the set of all vectors of the form $x = (0, 0, \gamma, \delta)$. 
These being the only eigenvectors (the only possible eigenvalue being 0), 
every non-trivial invariant subspace for A must contain one of them (other 
than 0). One way to establish that A is irreducible is to show that, for any x 
of the indicated form, the set consisting of x together with all its images under 
repeated applications of A and A* necessarily spans $\mathbb{C}^4$. Consider, indeed, 
the following vectors: 

$$\begin{aligned}
y_1 &= x = (0, 0, \gamma, \delta), \\
y_2 &= A^*x = (\gamma, \delta, 0, 0), \\
y_3 &= A^{*2}x = (\delta, 0, 0, 0), \\
y_4 &= AA^*x = (0, \gamma, \gamma, \delta), \\
y_5 &= AA^{*2}x = (0, \delta, \delta, 0), \\
y_6 &= A^2A^*x = (0, 0, 0, \gamma).
\end{aligned}$$

It is true that no matter what $\gamma$ and $\delta$ are, so long as not both are 0, these 
vectors span the space. If $\gamma = 0$, then $y_1$, $y_2$, $y_3$, and $y_5$ form a basis; if 
$\delta = 0$, then $y_1$, $y_2$, $y_6$, and a simple linear combination of $y_4$ and $y_2$ form a 
basis; if neither $\gamma$ nor $\delta$ is 0, then $y_3$, $y_6$, and simple linear combinations of 
$y_1$ and $y_6$ for one and of $y_2$ and $y_3$ for another form a basis. 
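
The six vectors are easy to generate by machine, and the rank of the set is easy to compute. The Python sketch below is a spot check for a few real values of $\gamma$ and $\delta$ (arbitrary choices made here), not a substitute for the argument; it takes A to be the $4 \times 4$ matrix that sends $(\alpha_1, \alpha_2, \alpha_3, \alpha_4)$ to $(0, \alpha_1, \alpha_1, \alpha_2)$, and the function names are hypothetical.

```python
from fractions import Fraction

A = [[0, 0, 0, 0],
     [1, 0, 0, 0],
     [1, 0, 0, 0],
     [0, 1, 0, 0]]
A_star = [list(col) for col in zip(*A)]   # real matrix: adjoint = transpose

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def orbit_vectors(gamma, delta):
    """x in ker A together with five of its images under A and A*."""
    y1 = [0, 0, gamma, delta]
    y2 = apply(A_star, y1)      # A* x
    y3 = apply(A_star, y2)      # A*^2 x
    y4 = apply(A, y2)           # A A* x
    y5 = apply(A, y3)           # A A*^2 x
    y6 = apply(A, y4)           # A^2 A* x
    return [y1, y2, y3, y4, y5, y6]

def rank(rows):
    """Rank by Gaussian elimination in exact arithmetic."""
    M = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(M[0])):
        pivot = next((r for r in range(rk, len(M)) if M[r][c] != 0), None)
        if pivot is None:
            continue
        M[rk], M[pivot] = M[pivot], M[rk]
        for r in range(len(M)):
            if r != rk and M[r][c] != 0:
                f = M[r][c] / M[rk][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[rk])]
        rk += 1
    return rk

# The three cases of the text: gamma = 0, delta = 0, neither 0.
ranks = [rank(orbit_vectors(g, d)) for g, d in [(0, 1), (1, 0), (2, 3)]]
```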

The question as asked is now answered, but the answer gives only a 
small clue to the more general facts (about possible irreducibility) for nilpotent 
transformations of index k on spaces of dimension n when k < n. The case 
k = 3 and n = 5 hints at the sort of thing that has to be looked at; the matrix 


$$\begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 \end{pmatrix}$$


does the job in that case. 

It should be emphasized that these considerations have to do with inner 
product spaces, where reduction is defined in terms of adjoints (or, equiv- 
alently, in terms of orthogonal complements). There is a purely algebraic 
theory of reduction (the existence for an invariant subspace of an invariant 
complement), and in that theory the present question is much easier to an- 
swer in complete generality. The structure theory of nilpotent transformations 
(in effect, the Jordan normal form), implies that the only chance a nilpotent 
transformation of index k on a space of dimension n has to be irreducible 
(that is: one of two complementary invariant subspaces must always be {0}) 
is to have k = n. 


Linear Algebra Problem Book can be either the main course or the dessert for 
someone who needs linear algebra, and today that means every user of 
mathematics. It can be used as the basis of either an official course or a 
program of private study. If used as a course, the book can stand by itself, 
or if so desired, it can be stirred in with a standard linear algebra course 
as the seasoning that provides the interest, the challenge, and the motivation 
that is needed by experienced scholars as much as by beginning students. 

The best way to learn is to do, and the purpose of this book is to get the 
reader to DO linear algebra. The approach is Socratic: first ask a question, 
then give a hint (if necessary), then, finally, for security and completeness, 
provide the detailed answer. 


Indeed, the author offers original insights illuminating the essence of the associative 
and distributive laws and the underlying algebraic structures (groups, fields, vector 
spaces). The core of the book is, of course, the study of linear transformations on 
finite-dimensional spaces. The problems are intended for the beginner, but some of 
them may challenge even an expert. 